Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SslStream API improvements for enhanced use cases #37933

Closed
wfurt opened this issue Jun 15, 2020 · 28 comments
Closed

SslStream API improvements for enhanced use cases #37933

wfurt opened this issue Jun 15, 2020 · 28 comments
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Net.Security
Milestone

Comments

@wfurt
Copy link
Member

wfurt commented Jun 15, 2020

Background and Motivation

This covers several use cases discussed in various runtime, aspnetcore and YARP threads.
This allow to implement new functionality as well as it allows callers to control some internal aspects of SslStream and X509Chain used internally.

Proposed API

namespace System.Security
{
+    public sealed class SslStreamCertificateContext
+    {
+          public static SslStreamCertificateContext CreateForServer(
+            X509Certificate2 target,
+            X509Certificate2Collection? additionalCertificates); 
+    }

     public delegate bool RemoteCertificateValidationCallback(object sender, X509Certificate certificate, X509Chain chain, SslPolicyErrors sslPolicyErrors);
     public delegate X509Certificate ServerCertificateSelectionCallback(object sender, string hostName);
+    public delegate ValueTask<SslServerAuthenticationOptions> ServerOptionsSelectionCallback(SslStream sender, string hostName, CancellationToken cancellationToken);

     public class SslServerAuthenticationOptions
     {
         RemoteCertificateValidationCallback RemoteCertificateValidationCallback { get; set; }
         X509Certificate ServerCertificate { get; set; }
         ServerCertificateSelectionCallback ServerCertificateSelectionCallback { get; set; };
+        SslStreamCertificateContext ServerCertificateContext  { get; set; };
     }

      public class SslStream 
      {
          public Task AuthenticateAsServerAsync(SslServerAuthenticationOptions sslServerAuthenticationOptions, CancellationToken cancellationToken = default);
          public void AuthenticateAsServer(SslServerAuthenticationOptions sslServerAuthenticationOptions);

          // Add overloaded with delegate returning SslServerAuthenticationOptions instead of passing SslServerAuthenticationOptions in.
+         public Task AuthenticateAsServerAsync(ServerOptionsSelectionCallback optionCallback, CancellationToken cancellationToken = default);
          // property to show name requested by client
+         public virtual string HostName { get; }
    }
}

SslStreamCertificateContext
There are cases when caller passes server certificate to AuthenticateAsServer(). That is single certificate and SslStream is trying to build complete chain, possibly doing synchronous HTTP calls behind the scenes. There is no coordination or visibility and that can create issues especially when the sites are not reachable or flaky. The intermediates are caches on disk, but that may not work as expected with docker and Kuberneties.

We talk about alternatives (like using just X509CertificateCollection or X509Chain) and decide to go with proposal of opaque object. That gives use option to extend it later and use is as state holder later when we need to add OCSP support, session cache and possibly other features.

This adds third (and hopefully last) way how to specify server certificate. e.g. new full context, existing X509Certificate and existing ServerCertificateSelectionCallback().

ServerOptionsSelectionCallback
There is desire to configure more properties (like SslProtocols) per specific server-name.
Great examples may be Kestrel and YARP when all HTTP servers may be on the same port yet expecting different policies. Perhaps legacy server allowing Tls1.x and more secured one requiring Tls12 or above. Since the server name client is trying to connect to is not known until first message from client, this is currently very difficult to implement. There are cases when users would sniff and parse incoming TLS frames to get the information soon enough.

The other use case new delegate is trying to solve is better ability to perform IO in selection callback. Current delegate is synchronous and that can be a problem if there is need to fetch certificate from database of key vault. There were also cases, when somebody want to inject audit logs to database and that leads to Async -> Sync -> Async path.

With proposed ValueTask, the callback can finish synchronously or synchronously, return one certificate or whole context and it can also set crypto parameters one would normally set on SslServerAuthenticationOptions.
#31097
#35844
dotnet/aspnetcore#21300

HostName
When AuthenticateAsClient() is called, it is mandatory that TargetHost is set. (either via SslClientAuthenticationOptions or via some convenience overloads). However, when RemoteCertificateValidationCallback() runs, there is no easy way to know what target SslStream was trying to connect to. We already store that information internally in SslStream and this would expose it as public property.

On server, this property holds name requested by client or it is empty string if SNI extension was not used.

The name can be DNS name in dot notation or it can be IP address as string.
Alternatively we can name this TargetHost to be consistent with terminology in SslClientAuthenticationOptions.
#27619

Alternative Designs

Since we have ServerCertificateSelectionCallback we could add ServerOptionsSelectionCallback there as well. It feels strange to have callback to select options part of the option.

There was some discussion around name HostName property.

Right now, the asks are to make properties configurable based on SNI (server name). If we want to make it more extensible for future we can consider one of the following:

  1. pass ClientHello in as raw data
    ReadOnlySpan<byte> clientHello That would allows callback to process any other extension from the request.

  2. use some container like:

public delegate ValueTask<SslServerAuthenticationOptions> ServerOptionsSelectionCallback(SslStream sender, SslClientHelloInfo clientHello, CancellationToken cancellationToken);

readonly struct SslClientHelloInfo
{
    string ServerName;
    SslProtocols SslProtocols;
    //maybe
    List<SslApplicationProtocol> ApplicationProtocols;
    // maybe
    ReadOnlySpan<byte> RawClientHello;
}

HTTP/2 specification requires certain ciphers to be blocked and this would be path to fulfill that expectation. e.g. If client asks for h2 and we are willing to do that, cipher policy could be used on platforms with support. (so as we can disable renegotiation)

The caveat is that ALPN is list so we would need to allocate. That may be negligible comparing to cost of crypto in SSL/TLS handshake.

Usage Examples

The expectation is that this can be used with PFX files or to-be-approved #31944

Caller can get and verify full chain once (when possible) and then use across multiple SslStream. If this needs to be done per server name, once would use AuthenticateAsServerAsync(optionCallback, cancellationToken);

    SslStreamCertificateContext CretateCertificateContext(ReadOnlySpan<char> certificate, ReadOnlySpan<char> certPem key, ReadOnlySpan<char> chainPem)
    {
            X509Certificate2 serverCertificate =  X509Certificate2.CreateFromPem(certificate, key);
            X509Certificate2Collection extra = new X509Certificate2Collection(serverCertificate);
            extra.ImportFromPem(chainPem)
 
            X509Chain chain = new X509Chain()
            chain.ChainPolicy.ApplicationPolicy.Add(serverAuthOid);
            chain.ChainPolicy.RevocationMode = X509RevocationMode.NoCheck;
            chain.ChainPolicy.DisableCertificateFetch = true;
            foreach (X509Certificate intermediate in extra)
            {
                chain.ChainPolicy.ExtraStore.Add(intermediate);
            }
             
            if (chain.Build())
            {
                // We have full chain in PEM data
                return SslStreamCertificateContext.CreateForServer(serverCertificate, chain.ChainElements);
            }
 
            // Log warning ???
            // Try to build chain with network access.
            chain.ChainPolicy.DisableCertificateFetch = false;
            chain.ChainPolicy.UrlRetrievalTimeout = TimeSpan.FromSeconds(_maxFetchTime);
            if (chain.Build())
            {
                // Build was successful. Intermediate certificates will be cached now.
                return SslStreamCertificateContext.CreateForServer(serverCertificate, chain.ChainElements);
            }
 
            // Failed to fetch chain parts.
            // do you favorite error mitigation here
            return null;
    }
 
    void PrepareServer(String name, ReadOnlySpan<char> certificate, ReadOnlySpan<char> certPem key, ReadOnlySpan<char> chain)
    {
            var options = new SslServerAuthenticationOptions()
            // customize options per site
            options.ServerCertificateContext = CretateCertificateContext(certificate, key, chain);
            _configurations.Add(name, options);
    }
 
    ValueTask<SslServerAuthenticationOptions> ServerOptionsSelectionCallback(SslStream sender, string hostname, CancellationToken cancellationToken)
    {
            if (!_configurations.TryGet(name, out SslServerAuthenticationOptions options))
            {
                // unknown site?
                return ValueTask.FromException(new Exception());
            }
 
            return new ValueTask<SslServerAuthenticationOptions>(options);
    }

Risks

This is primarily meant for advanced use cases. It feels like it would be easy to mess up in async callback.

This should be pretty much what we agreed on @bartonjs @davidfowl @halter73
Let me know if you think it is ready for API review.

@wfurt wfurt added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Net.Security labels Jun 15, 2020
@ghost
Copy link

ghost commented Jun 15, 2020

Tagging subscribers to this area: @dotnet/ncl
Notify danmosemsft if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jun 15, 2020
@scalablecory
Copy link
Contributor

Your proposed API is messed up, looks like some methods are outside of SslStream.

@scalablecory
Copy link
Contributor

Are there any other interesting parameters in the client hello that we might want to pass (maybe not now, but in future) beyond the hostName that is in the new delegate? Would it make more sense to pass a new class instead that we could add additional properties to later?

@scalablecory
Copy link
Contributor

Having the sync API use an async callback isn't great. Can a second delegate be added for sync?

@scalablecory scalablecory added this to the 5.0.0 milestone Jun 16, 2020
@scalablecory scalablecory removed the untriaged New issue has not been triaged by the area owner label Jun 16, 2020
@halter73
Copy link
Member

halter73 commented Jun 16, 2020

Considering this a new server-only API, would we be OK with not adding a sync version of AuthenticateAsServerAsync(ServerOptionsSelectionCallback, CancellationToken)?

@davidfowl
Copy link
Member

I think @scalablecory means and overload that takes a synchronous callback as well

@halter73
Copy link
Member

That's how I understood @scalablecory's comment as well, but no need to have a overload with a sync callback if there's no sync overload.

@vcsjones
Copy link
Member

Just so I am clear, there is no expectation that X509Chain has to be used? It seems that the API allows for this, so I'm just making sure I understand correctly.

If I am saying "here's my leaf, here's an ordered set of additional intermediates" I would not expect X509Chain to be used at all - I would expect the server to just accept it as-is and send it down in the Certificate certificate_list part of the handshake (after ServerHello). Path-building, revocation, trust, policy, etc all the responsibilities of the client / user agent.

@bartonjs
Copy link
Member

Just so I am clear, there is no expectation that X509Chain has to be used

My thoughts, at least, are that it's used once, when this object is created, so that it can normalize the inputs (e.g. remove duplicates, make sure the order is correct, etc).

@wfurt
Copy link
Member Author

wfurt commented Jun 16, 2020

I added AuthenticateAsServer(ServerOptionsSelectionCallback optionCallback) as we just added client sync complements. We can certainly do that later. As far as the delegate, I was expecting that on synchronous path, that would finish synchronously returning the "value" part of ValueTask. That is still possible even with the AuthenticateAsServerAsync is everything is ready in memory.

@scalablecory
Copy link
Contributor

I added AuthenticateAsServer(ServerOptionsSelectionCallback optionCallback) as we just added client sync complements.

Yea, I think a sync server is unlikely, but as we just decided to invest in the other sync overload, I can see us adding this here to be consistent.

I was expecting that on synchronous path, that would finish synchronously returning the "value" part of ValueTask.

Ehh. The usability isn't very idiomatic there -- I think two delegates would be best. Also, what happens if I do something weird like call the sync method with a callback that is actually async?

@scalablecory
Copy link
Contributor

Are the options ServerCertificateContext and ServerCertificate essentially setting the same option? The former is just a "prepared" version of the latter with additional certs?

@wfurt
Copy link
Member Author

wfurt commented Jun 16, 2020

I kick the sync overload out. It seems like we can add it if there is real need.

There are already three ways how certificate can be selected: either as property passed in, using ServerCertificateSelectionCallback from options or using LocalCertificateSelectionCallback on SslStream it self. With this proposal we are also adding option with SslStreamCertificateContext. Assuming advanced use, I did not want to add too many overload but do one capable of all the functions.

as far as the context:
The difference is that with just the certificate we build chain every time (and may do network IO) and there is really no state between calls. The ServerCertificateContext allows to carry state between streams and attach things like OCSP to it.

@scalablecory
Copy link
Contributor

The difference is that with just the certificate we build chain every time (and may do network IO)

Should SslStreamCertificateContext.CreateForServer be made async if it can do network I/O?

Also, would a client version make sense so we could cache such an object at the pool level in SocketsHttpHandler?

@bartonjs
Copy link
Member

Should SslStreamCertificateContext.CreateForServer be made async if it can do network I/O?

No, because the generalization is that the OS is what does the I/O, but within a synchronous native function. (The I/O is considered an implementation detail of the OS function.)

@vcsjones
Copy link
Member

My thoughts, at least, are that it's used once, when this object is created, so that it can normalize the inputs

That certainly alleviates a few common misconfigurations of HTTPS.

On the other hand I could see "I want to start up as fast as possible". Putting mTLS aside, most web servers do very minimal things with X509. Maybe enough X509 parsing to check public / private key equality and to get the SANs off the cert so that it can pick the right cert in an SNI scenario, but that's about it.

If it has no bearing on API shape I won't pollute the issue further, I just wanted to know if that scenario is possible with this.

@davidfowl
Copy link
Member

@vcsjones yes BUT it's up to the us to decide what behavior we do in the server.

@wfurt
Copy link
Member Author

wfurt commented Jun 16, 2020

For that reason #37485 was created @vcsjones. So one can try to build the chain without any IO and decide what to do - error out, move forward with incomplete chain or try to fetch missing bits.
If given SslStreamCertificateContext has all needed parts, there will be not need for SslStream or X509Chain inside to fetch any bits.

@vcsjones
Copy link
Member

@bartonjs

normalize the inputs

My thought, and my concern may be misplaced, is that in this case the platform building the chain for the server forms an opinion on the certificate path when really a certificate has multiple trust paths.

Consider this (real) certificate chain:

AGH

D is cross signed. If the chain is built on a server as Leaf → E → D:root, but some old Android client can only build Leaf → E → D:intermediate → C, and the server only sent the "seen" intermediates in the chain, then D:intermediate would get removed from the sent certificates even though older clients might want it.

Am I misunderstanding?

@bartonjs
Copy link
Member

Yeah, multiple paths always mess up tidy logic.

My gut says that the easy solution here should be the normalizing one. Though perhaps we need an overload for CreateAndIKnowWhatImDoing, which would just do garbage-in/garbage-out. The name, of course, could use work.

@wfurt
Copy link
Member Author

wfurt commented Jun 16, 2020

Do we have test cases like this @bartonjs? If not and if this is real scenario perhaps we can turn it into test case...

@karelz
Copy link
Member

karelz commented Jun 16, 2020

NCL discussion:

  • Consider Create instead of CreateForServer as it is agnostic of client/server
  • Drop duplicate hostName arg in ServerOptionsSelectionCallback (it is already in SslStream)
  • Should we change HostName to a class (future proof if there is more info in future - like ALPN, SNI, etc.)

Overall ready for API review

@wfurt wfurt added api-ready-for-review and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Jun 16, 2020
@wfurt wfurt added the blocking Marks issues that we want to fast track in order to unblock other important work label Jun 16, 2020
@wfurt
Copy link
Member Author

wfurt commented Jun 16, 2020

I added section to alternative designs.

@bartonjs
Copy link
Member

Do we have test cases [for multi-path chains]?

Not really, though we could technically do it now that we support contextual roots. We just let the OS libraries build a chain, and they usually prefer (to my understanding) the first of

  • The shortest path to a trusted root
  • (Maybe?) The shortest path to a self-signed certificate
  • The longest path it could find

So if the shortest trusted path is 11 hops and the shortest self-signed is 3, 11 is what gets returned, but once you trust the root of the 3-element chain the 11 element version doesn't exist.

It pops up when new CAs come online, because they sometimes get "cross-certified" by existing CAs. As their trust rolls out no one cares about the cross-certification path (@vcsjones' example) and everyone generally pretends it doesn't exist (still applicable to that example 😄). A cross-certified chain was the problem in #17550 (where we ended up forcing OpenSSL to build a longer chain that it didn't trust, but we've since redefined our chain builder to let OpenSSL understand what success means, now it's back to the short trusted one by preference), so it was the opposite of the example above... but that was just a bug on our side.

if this is real scenario

It... usually isn't. But when it is, it's a real pain for the people affected. So it's probably good to consider it here. As I said before, I think that we probably want the easy and recommended path to be that we normalize things for you (after all, I don't think we get a choice on Windows), but if we want to have a power-user "we'll do whatever you say" variant, that's now expressible via the opaque type. But I'll leave "what do we do on Windows" as an exercise left to the implementer.

@vcsjones
Copy link
Member

It pops up when new CAs come online [...] As their trust rolls out

Well, this takes several years because a lot of old devices (cough *android* cough) don't get updates. As an example, Let's Encrypt has delayed moving off their cross-signed root to their ISRG again.

I suppose this is how all of this works on Windows / .NET today, and if there haven't been a lot of customer reported issues from it, then my concerns may not have much merit. 🤷‍♂️

@bartonjs
Copy link
Member

(Sorry about the close/reopen, accidental click)

@bartonjs
Copy link
Member

bartonjs commented Jun 18, 2020

Video

  • ServerOptionsSelectionCallback should use a context object instead of the hostname for future expansion
  • Rename sender to stream in ServerOptionsSelectionCallback
  • Why is SslStream.HostName virtual?
  • Rename HostName to TargetHostName
  • Rename SslStreamCertificateContext.CreateForServer to SslStreamCertificateContext.Create to unify with client usage (and drop the usage type check)
  • Add an offline parameter to the SslStreamCertificateContext Create method
  • Add a debugger proxy to show the target cert.
public readonly struct SslClientHelloInfo
{
    public string ServerName { get; }
    public SslProtocols SslProtocols { get; }
}

public delegate ValueTask<SslServerAuthenticationOptions> ServerOptionsSelectionCallback(SslStream stream, SslClientHelloInfo clientHelloInfo, object? state, CancellationToken cancellationToken);

partial class SslStream
{
    public string TargetHostName { get; }
    public Task AuthenticateAsServerAsync(
        ServerOptionsSelectionCallback optionCallback,
        object? state,
        CancellationToken cancellationToken = default);      
}

public sealed class SslStreamCertificateContext
{
    public static SslStreamCertificateContext Create(
        X509Certificate2 target,
        X509Certificate2Collection? additionalCertificates,
        bool offline = false);
}

partial class SslServerAuthenticationOptions
{
    SslStreamCertificateContext ServerCertificateContext  { get; set; };
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review blocking Marks issues that we want to fast track in order to unblock other important work labels Jun 18, 2020
@wfurt
Copy link
Member Author

wfurt commented Jul 15, 2020

all API surface is available now.

@wfurt wfurt closed this as completed Jul 15, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Net.Security
Projects
None yet
Development

No branches or pull requests

8 participants