Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not use AllocHGlobal in Pkcs12Reader #97447

Closed
wants to merge 6 commits into from
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ internal abstract class UnixPkcs12Reader : IDisposable
private ContentInfoAsn[]? _safeContentsValues;
private CertAndKey[]? _certs;
private int _certCount;
private PointerMemoryManager<byte>? _tmpManager;
private Memory<byte> _tmpMemory;
private bool _allowDoubleBind;

protected abstract ICertificatePalCore ReadX509Der(ReadOnlyMemory<byte> data);
Expand All @@ -39,19 +39,13 @@ protected void ParsePkcs12(ReadOnlySpan<byte> data)

// Windows compatibility: Ignore trailing data.
ReadOnlySpan<byte> encodedData = reader.PeekEncodedValue();
byte[] dataWithoutTrailing = GC.AllocateUninitializedArray<byte>(encodedData.Length, pinned: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I have the same concerns here as I did with the other I reviewed. The pinned object heap is not designed for short-lived objects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 This is replacing memory leak with a perf bug that can be worse than the memory leak.

Can you wrap the unmanaged memory in a safe handle and AddRef/Release the safe handle around accesses?

Copy link
Member Author

@krwq krwq Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a perf expert but this is not a hot path by any means - I suppose all operations done later with cert are way heavier than any cost of this specific allocation/freeing - note there is also lots of parsing going on even within constructor but that's without doing any actual crypto related stuff.

I honestly think we should either change code so that pinning is not needed at all or invest in changes which make this construct fast (I'd imagine passing bytes to native code isn't as uncommon).

I personally greatly prefer simplicity to small perf gains but if there is evidence this is as bad then maybe it is worth it.

When I run this in a tight loop (that was just ctor not E2E) with different cert sizes OOM was very visible while this potential perf issue wasn't particularly noticable (but I didn't measure the throughput) - I'm having a really hard time seeing how it can be worse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't about the perf of this code. It's about the effect it has on the rest of the application.

Copy link
Member

@stephentoub stephentoub Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about the effect it has on the rest of the application.

Just to demonstrate what I'm talking about, here it's not the Mean that I care most about (though that it is significantly slower and reflective of the impact this has on not only this code but other code in the process as well), but more the fact that in the non-pinned version there are 0 gen1 and 0 gen2 GCs, and that's no longer true in the pinned version: the POH gets collected as part of gen2, so short-lived allocations that would otherwise be gen0 become gen2, while also leading to excessive fragmentation of the POH, and that affects everything else in the app.

Method Length Mean Error Ratio Gen0 Gen1 Gen2
NotPinned 10 7.054 ns 0.0653 ns 1.00 0.0004 - -
Pinned 10 234.552 ns 4.7442 ns 33.26 0.0014 0.0014 0.0014
NotPinned 1000 78.747 ns 0.7053 ns 1.00 0.0111 - -
Pinned 1000 1,175.699 ns 23.2682 ns 15.03 0.0267 0.0267 0.0267
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

[MemoryDiagnoser]
public class Tests
{
    [Params(10, 1000)]
    public int Length { get; set; }

    [Benchmark(Baseline = true)]
    public byte[] NotPinned() => GC.AllocateUninitializedArray<byte>(Length, pinned: false);

    [Benchmark]
    public byte[] Pinned() => GC.AllocateUninitializedArray<byte>(Length, pinned: true);
}

Copy link
Member Author

@krwq krwq Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity - does allocating unpinned and then using scoped fixed statement make any difference?

Copy link
Member

@jkotas jkotas Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity - does allocating unpinned and then using scoped fixed statement make any difference?

Yes, that would work just fine in this case. However, I am not sure whether it is the right choice for this code. I assume that it allocates pinned memory to prevent GC from creating copies of the certificate in process memory when it compacts the heap. Comment that explains why this allocates pinned memory would be nice.

Why aren't pinned objects (or at least arrays) implemented in terms of GlobalHAlloc?

It would not be possible. GlobalHAlloc allocations are not managed by the GC. You have to free the allocation explicitly. Pinned object heap allocations are managed by the GC. They are collected when no longer referenced, there is no explicit free operation for you to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, the pinned object heap would not have these perf characteristics and its performance characteristics would be more like GlobalHAlloc. There is nothing fundamental preventing that. It is not how it behaves today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that it allocates pinned memory to prevent GC from creating copies of the certificate in process memory when it compacts the heap.

Yep. Well, the concern is with the private key material instead of the certificates per se, but the point is to avoid GC compaction clones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub @jkotas @bartonjs I've updated the code back to the old version and I'm using SafeHandle now, please review

encodedData.CopyTo(dataWithoutTrailing);
_tmpMemory = MemoryMarshal.CreateFromPinnedArray(dataWithoutTrailing, 0, dataWithoutTrailing.Length);

unsafe
{
IntPtr tmpPtr = Marshal.AllocHGlobal(encodedData.Length);
Span<byte> tmpSpan = new Span<byte>((byte*)tmpPtr, encodedData.Length);
encodedData.CopyTo(tmpSpan);
_tmpManager = new PointerMemoryManager<byte>((void*)tmpPtr, encodedData.Length);
}

ReadOnlyMemory<byte> tmpMemory = _tmpManager.Memory;
reader = new AsnValueReader(tmpMemory.Span, AsnEncodingRules.BER);
reader = new AsnValueReader(_tmpMemory.Span, AsnEncodingRules.BER);

PfxAsn.Decode(ref reader, tmpMemory, out PfxAsn pfxAsn);
PfxAsn.Decode(ref reader, _tmpMemory, out PfxAsn pfxAsn);

if (pfxAsn.AuthSafe.ContentType != Oids.Pkcs7Data)
{
Expand Down Expand Up @@ -112,26 +106,8 @@ internal IEnumerable<CertAndKey> EnumerateAll()

public void Dispose()
{
// Generally, having a MemoryManager cleaned up in a Dispose is a bad practice.
// In this case, the UnixPkcs12Reader is only ever created in a using statement,
// never accessed by a second thread, and there isn't a manual call to Dispose
// mixed in anywhere.
if (_tmpManager != null)
{
unsafe
{
Span<byte> tmp = _tmpManager.GetSpan();
CryptographicOperations.ZeroMemory(tmp);

fixed (byte* ptr = tmp)
{
Marshal.FreeHGlobal((IntPtr)ptr);
}
}

((IDisposable)_tmpManager).Dispose();
_tmpManager = null;
}
CryptographicOperations.ZeroMemory(_tmpMemory.Span);
_tmpMemory = Memory<byte>.Empty;

ContentInfoAsn[]? rentedContents = Interlocked.Exchange(ref _safeContentsValues, null);
CertAndKey[]? rentedCerts = Interlocked.Exchange(ref _certs, null);
Expand Down
Loading