Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HttpHeaders.NonValidated #53555

Merged
merged 1 commit into from
Jun 6, 2021

Conversation

stephentoub
Copy link
Member

This adds an HttpHeaders.NonValidated property, which returns a type that provides a non-validating / non-parsing / non-allocating view of headers in the collection. Querying the resulting collection does not force parsing or validation on the contents of the headers, handing back exactly the raw data that it contains; if a header doesn't contain a raw value but instead contains an already parsed value, a string representation of that header value(s) is returned. When using the strongly-typed members, querying and enumeration is allocation-free, unless strings need to be created to represent already parsed values.

Fixes #35126 (comment)
cc: @geoffkizer, @scalablecory

@dotnet-issue-labeler
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost
Copy link

ghost commented Jun 1, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

This adds an HttpHeaders.NonValidated property, which returns a type that provides a non-validating / non-parsing / non-allocating view of headers in the collection. Querying the resulting collection does not force parsing or validation on the contents of the headers, handing back exactly the raw data that it contains; if a header doesn't contain a raw value but instead contains an already parsed value, a string representation of that header value(s) is returned. When using the strongly-typed members, querying and enumeration is allocation-free, unless strings need to be created to represent already parsed values.

Fixes #35126 (comment)
cc: @geoffkizer, @scalablecory

Author: stephentoub
Assignees: -
Labels:

area-System.Net.Http

Milestone: 6.0.0

@stephentoub
Copy link
Member Author

Quick benchmark:

Method Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
EnumerateOld 83.92 us 1.568 us 1.390 us 1.00 0.00 0.4883 - - 3 KB
EnumerateNew 68.01 us 0.763 us 0.713 us 0.81 0.02 0.3662 - - 2 KB
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading.Tasks;

[module: SkipLocalsInit]

[MemoryDiagnoser]
public class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssemblies(new[] { typeof(Program).Assembly }).Run(args);

    [Benchmark(Baseline = true)]
    public async Task EnumerateOld()
    {
        var request = new HttpRequestMessage(HttpMethod.Get, s_uri);
        using var resp = await s_client.SendAsync(request, default);
        foreach (var header in resp.Headers) { }
        foreach (var contentHeader in resp.Content.Headers) { }
        await resp.Content.CopyToAsync(Stream.Null);
    }

    [Benchmark]
    public async Task EnumerateNew()
    {
        var request = new HttpRequestMessage(HttpMethod.Get, s_uri);
        using var resp = await s_client.SendAsync(request, default);
        foreach (var header in resp.Headers.NonValidated) { }
        foreach (var contentHeader in resp.Content.Headers.NonValidated) { }
        await resp.Content.CopyToAsync(Stream.Null);
    }

    private static readonly Socket s_listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    private static readonly HttpMessageInvoker s_client = new HttpMessageInvoker(new SocketsHttpHandler { UseProxy = false, UseCookies = false, AllowAutoRedirect = false });
    private static Uri s_uri;

    [GlobalSetup]
    public void CreateSocketServer()
    {
        s_listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        s_listener.Listen(int.MaxValue);
        var ep = (IPEndPoint)s_listener.LocalEndPoint;
        s_uri = new Uri($"http://{ep.Address}:{ep.Port}/");
        byte[] response = Encoding.UTF8.GetBytes("HTTP/1.1 200 OK\r\nDate: Tue, 01 Jul 2021 12:00:00 GMT \r\nServer: Example\r\nAccess-Control-Allow-Credentials: true\r\nAccess-Control-Allow-Origin: *\r\nConnection: keep-alive\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 5\r\n\r\nHello");
        byte[] endSequence = new byte[] { (byte)'\r', (byte)'\n', (byte)'\r', (byte)'\n' };

        Task.Run(async () =>
        {
            while (true)
            {
                Socket s = await s_listener.AcceptAsync();
                _ = Task.Run(() =>
                {
                    using (var ns = new NetworkStream(s, true))
                    {
                        byte[] buffer = new byte[1024];
                        int totalRead = 0;
                        while (true)
                        {
                            int read = ns.Read(buffer, totalRead, buffer.Length - totalRead);
                            if (read == 0) return;
                            totalRead += read;
                            if (buffer.AsSpan(0, totalRead).IndexOf(endSequence) == -1)
                            {
                                if (totalRead == buffer.Length) Array.Resize(ref buffer, buffer.Length * 2);
                                continue;
                            }

                            ns.Write(response, 0, response.Length);

                            totalRead = 0;
                        }
                    }
                });
            }
        });
    }
}

@stephentoub stephentoub force-pushed the nonvalidatedheaders branch from 3df0c42 to 55fe14b Compare June 1, 2021 22:08
@geoffkizer
Copy link
Contributor

@alnikola You worked on this in the last release, can you review this PR?

Copy link
Contributor

@alnikola alnikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good under the assumption that we don't preserve the original raw value received from the wire in case it has been overwritten.

@stephentoub
Copy link
Member Author

we don't preserve the original raw value received from the wire in case it has been overwritten

Right. I was initially hoping we could preserve the original, and I think we could find a way to do so selectively for some headers, but for ones that produce a mutable object, there's not a great way to reliably do so. I'd rather see us separately tackle making it so that SocketsHttpHandler doesn't access strongly-typed members that overwrite the raw value.

@geoffkizer
Copy link
Contributor

For future reference, I think there are a couple things we could do here to better preserve the original raw header values.

The biggest concern, I think, is internal access to strongly-typed properties like Content-Length. This forces the parse and thus loses the raw value today. We could instead add something like GetParsedValueNoStore that parses the value, but doesn't store it, and thus leaves the raw value in the store. This would mean that if the user accessed the property, it would get re-parsed, unlike today, but I think that's fine.

Additionally, there's at least one case I know of today (User-Agent for proxy tunnel) where we are reading a parsed header value, but we don't care about the actual value, we are just copying it. In this case we should not access the typed property at all, and instead retrieve the raw value.

The case where the user retrieves typed properties and thus loses the original raw value is trickier. For a decent number of headers, there's no issue with mutability and so we could know if/when the strongly typed value changes and either write through or produce on demand the new raw value, while preserving the original raw value if there is no change made.

Even for cases with mutability, we could potentially solve this by maintaining a copy of the "original parsed values" and detecting changes by comparing that against the new parsed values. That's a fair amount of work, though, and would add some additional overhead in storing and processing the values.

@geoffkizer
Copy link
Contributor

Here's the User-Agent issue I mentioned above: #51583

It's marked as closed, but I still think we have an issue here in the proxy tunnel case.

@geoffkizer
Copy link
Contributor

Beyond this PR: I wonder if we should make this collection mutable. We already have TryAddWithoutValidation. If we were designing this from scratch, I think we'd have TryAdd on this collection rather than its own standalone method.

Not a priority, just something to consider. I don't think there's anything here that would preclude doing that in the future, right?

@stephentoub
Copy link
Member Author

stephentoub commented Jun 2, 2021

Beyond this PR: I wonder if we should make this collection mutable. We already have TryAddWithoutValidation. If we were designing this from scratch, I think we'd have TryAdd on this collection rather than its own standalone method.

We could. I actually considered just doing that, but thought it was unnecessary scope creep, since you can do it all today, just with APIs on the original collection rather than here.

But if you think it's important or helpful, we could just make it implement ICollection<T> instead of IReadOnlyCollection<T>. It shouldn't be hard. Think I should?

I don't think there's anything here that would preclude doing that in the future, right?

I think it'd be fine. It would just mean the type ends up implementing both interfaces, but that's the case with many of our collections.

@stephentoub stephentoub force-pushed the nonvalidatedheaders branch from 55fe14b to 5068355 Compare June 2, 2021 22:02
@geoffkizer
Copy link
Contributor

Think I should?

I'm tempted to say yes, but it's probably more reasonable to just leave things as they are.

@stephentoub
Copy link
Member Author

I'm tempted to say yes, but it's probably more reasonable to just leave things as they are.

OK, I'll leave it as is.

This adds an HttpHeaders.NonValidated property, which returns a type that provides a non-validating / non-parsing / non-allocating view of headers in the collection.  Querying the resulting collection does not force parsing or validation on the contents of the headers, handing back exactly the raw data that it contains; if a header doesn't contain a raw value but instead contains an already parsed value, a string representation of that header value(s) is returned.  When using the strongly-typed members, querying and enumeration is allocation-free, unless strings need to be created to represent already parsed values.
@stephentoub stephentoub force-pushed the nonvalidatedheaders branch from 2b78666 to b2eb92d Compare June 4, 2021 21:27
@stephentoub stephentoub merged commit 86903dd into dotnet:main Jun 6, 2021
@stephentoub stephentoub deleted the nonvalidatedheaders branch June 6, 2021 21:23
@ghost ghost locked as resolved and limited conversation to collaborators Jul 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API Proposal: Non-validated HttpHeaders enumeration
4 participants