New ASCII APIs #75012

Merged Dec 21, 2022 (51 commits)
Changes from 48 commits

Commits
af2b950
Initial ASCII methods
GrabYourPitchforks Jul 20, 2022
bd2b5f1
Add transcoding APIs
GrabYourPitchforks Jul 20, 2022
db37d32
Implement Trim
GrabYourPitchforks Jul 21, 2022
4a832cc
Split ASCII utilities into separate files
GrabYourPitchforks Jul 21, 2022
e054a01
Add ref asm
GrabYourPitchforks Jul 21, 2022
36cbfa6
Fun with case conversion!
GrabYourPitchforks Jul 22, 2022
bba61f5
Fun with case conversion!
GrabYourPitchforks Jul 22, 2022
89331d6
Updates!
GrabYourPitchforks Jul 22, 2022
8333ec8
Fix incorrect comparison
GrabYourPitchforks Jul 22, 2022
841fe3c
Fix incorrect precondition checks
GrabYourPitchforks Jul 22, 2022
dce2cae
Update main vectorized loop
GrabYourPitchforks Jul 22, 2022
685b330
Perf improvements & fix arithmetic error
GrabYourPitchforks Jul 23, 2022
4914c65
tests for Ascii.GetIndexOfFirstNonAsciiByte
adamsitnik Aug 29, 2022
7204d2d
tests for Ascii.GetIndexOfFirstNonAsciiChar
adamsitnik Aug 29, 2022
e3709b7
add tests for Ascii.IsAscii
adamsitnik Aug 29, 2022
fc6db59
add tests for Ascii.FromUtf16
adamsitnik Aug 29, 2022
e316789
add tests for Ascii.ToUtf16
adamsitnik Aug 30, 2022
a5f61b9
add tests for Ascii.Trim* and fix bug they have discovered
adamsitnik Aug 30, 2022
6516ae2
ToUpper & ToLower tests
adamsitnik Aug 30, 2022
6e1ca32
implement the missing pieces for case conversions + fix the tests
adamsitnik Aug 31, 2022
5c90223
Merge remote-tracking branch 'upstream/main' into asciiAPIs
adamsitnik Aug 31, 2022
4339af5
implement TryToLowerInPlace/TryToUpperInPlace
adamsitnik Aug 31, 2022
cc3be10
implement Ascii.StartsWith* and EndsWith* methods
adamsitnik Aug 31, 2022
ad4d90b
implement Ascii.Equals* methods
adamsitnik Sep 1, 2022
adc2f53
use self-describing names at a cost of using pragma disable ;)
adamsitnik Sep 1, 2022
aad125a
throw ArgumentException with meaningful error message
adamsitnik Sep 2, 2022
f8f98ed
rename files
adamsitnik Sep 2, 2022
2b2bcd1
Implement IndexOf and LastIndexOf using narrowing and widening
adamsitnik Sep 2, 2022
2d9105e
Implement IndexOfIgnoreCase and LastIndexOfIgnoreCase
adamsitnik Sep 2, 2022
cc603c7
refactoring
adamsitnik Sep 5, 2022
aff7e6e
IsAscii methods
adamsitnik Sep 5, 2022
bf5d709
*GetHashCode(chars)
adamsitnik Sep 5, 2022
ba1102d
*GetHashCode(bytes)
adamsitnik Sep 5, 2022
0ffd3ee
solve buffer overrun: 8 chars need to be narrowed to 8 (not 16 like b…
adamsitnik Sep 6, 2022
6f96b8b
disable the tests that are failing due to Mono bug
adamsitnik Sep 6, 2022
8ab53c2
Merge remote-tracking branch 'upstream/main' into asciiAPIs
adamsitnik Sep 6, 2022
3d5df19
fix a bug (tests that are not compiled are always passing)
adamsitnik Sep 7, 2022
518ef05
use new APIs across BCL:
adamsitnik Sep 7, 2022
ab12f70
Apply suggestions from code review
adamsitnik Sep 8, 2022
29b38f6
address code review feedback
adamsitnik Sep 8, 2022
bc48a30
Merge remote-tracking branch 'upstream/main' into asciiAPIs
adamsitnik Sep 9, 2022
1edc812
fold ASCIIUtility into Ascii, use public APIs
adamsitnik Sep 9, 2022
e2d7105
fix tests that were relying on reflection so far (and I did not know …
adamsitnik Sep 9, 2022
6106659
fix byte->char casting
adamsitnik Sep 12, 2022
0d69abd
adjust code after recent API Review
adamsitnik Dec 7, 2022
bb0a272
add missing XML docs
adamsitnik Dec 7, 2022
b1f1f07
Merge remote-tracking branch 'upstream/main' into asciiAPIs
adamsitnik Dec 7, 2022
c0f38d1
cleanup
adamsitnik Dec 7, 2022
da94353
address code review feedback
adamsitnik Dec 9, 2022
b975fe1
Merge remote-tracking branch 'upstream/main' into asciiAPIs
adamsitnik Dec 9, 2022
4483baf
Update src/libraries/System.Private.Uri/src/System/DomainNameHelper.cs
adamsitnik Dec 9, 2022
@@ -10,6 +10,8 @@
using Internal.Runtime.Augments;
using Internal.Runtime.CompilerHelpers;
using Internal.Runtime.CompilerServices;
using System.Text;
using System.Buffers;

namespace System.Runtime.InteropServices
{
@@ -502,17 +504,7 @@ public static unsafe char AnsiCharToWideChar(byte nativeValue)
internal static unsafe byte* StringToAnsiString(char* pManaged, int lenUnicode, byte* pNative, bool terminateWithNull,
bool bestFit, bool throwOnUnmappableChar)
{
bool allAscii = true;

for (int i = 0; i < lenUnicode; i++)
{
if (pManaged[i] >= 128)
{
allAscii = false;
break;
}
}

bool allAscii = Ascii.IsValid(new ReadOnlySpan<char>(pManaged, lenUnicode));
int length;

if (allAscii) // If all ASCII, map one UNICODE character to one ANSI char
@@ -530,17 +522,8 @@ public static unsafe char AnsiCharToWideChar(byte nativeValue)
}
if (allAscii) // ASCII conversion
{
byte* pDst = pNative;
char* pSrc = pManaged;

while (lenUnicode > 0)
{
unchecked
{
*pDst++ = (byte)(*pSrc++);
lenUnicode--;
}
}
OperationStatus conversionStatus = Ascii.FromUtf16(new ReadOnlySpan<char>(pManaged, length), new Span<byte>(pNative, length), out _);
Debug.Assert(conversionStatus == OperationStatus.Done);
Review comment (Member):

Not for this PR, but as a potential follow-up, if it'll be common for pNative to be non-NULL and if data suggests it'll be long enough on average, it might be worth trying a fast-path that just does FromUtf16 without first doing IsValid.
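
The follow-up suggested in this comment could look roughly like the sketch below. It is not code from this PR: `AsciiFastPathSketch` and `TryNarrowToAscii` are hypothetical names, and it assumes a runtime where `System.Text.Ascii` is public (.NET 8 or later). The idea is to call `Ascii.FromUtf16` directly and use the returned `OperationStatus` to detect non-ASCII input, instead of scanning the input once with `Ascii.IsValid` and a second time to narrow it.

```csharp
using System;
using System.Buffers;
using System.Text;

static class AsciiFastPathSketch
{
    // Hypothetical helper (not part of the PR): narrows UTF-16 to ASCII in a single pass.
    // Returns true only when every char was ASCII and the destination was large enough.
    // On failure, bytesWritten reports how many bytes were written before the call stopped.
    public static bool TryNarrowToAscii(ReadOnlySpan<char> source, Span<byte> destination, out int bytesWritten)
    {
        OperationStatus status = Ascii.FromUtf16(source, destination, out bytesWritten);

        // InvalidData means a non-ASCII char was found;
        // DestinationTooSmall means the buffer could not hold the whole input.
        return status == OperationStatus.Done;
    }
}
```

When the helper returns false, the caller would fall back to the existing path (here, letting the OS convert) and overwrite whatever was partially written. Whether this beats the `IsValid` check depends on how often `pNative` is non-null and how long typical inputs are, which is why the comment asks for data first.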

}
else // Let OS convert
{
@@ -566,26 +549,9 @@ public static unsafe char AnsiCharToWideChar(byte nativeValue)
/// </summary>
private static unsafe bool CalculateStringLength(byte* pchBuffer, out int ansiBufferLen, out int unicodeBufferLen)
{
ansiBufferLen = 0;

bool allAscii = true;

{
byte* p = pchBuffer;
byte b = *p++;

while (b != 0)
{
if (b >= 128)
{
allAscii = false;
}

ansiBufferLen++;

b = *p++;
}
}
ReadOnlySpan<byte> span = MemoryMarshal.CreateReadOnlySpanFromNullTerminated(pchBuffer);
ansiBufferLen = span.Length;
bool allAscii = Ascii.IsValid(span);

if (allAscii)
{
@@ -91,7 +91,7 @@ private async Task ReadPrefixAsync()
throw new Exception("Connection stream closed while attempting to read connection preface.");
}

if (Text.Encoding.ASCII.GetString(_prefix).Contains("HTTP/1.1"))
if (_prefix.AsSpan().IndexOf("HTTP/1.1"u8) >= 0)
{
// Tests that use HttpAgnosticLoopbackServer will attempt to send an HTTP/1.1 request to an HTTP/2 server.
// This is invalid and we should terminate the connection.
@@ -422,7 +422,7 @@ private static string EncodeAndQuoteMime(string input)
throw new ArgumentException(SR.Format(CultureInfo.InvariantCulture,
SR.net_http_headers_invalid_value, input));
}
else if (HeaderUtilities.ContainsNonAscii(result))
else if (!Ascii.IsValid(result))
{
needsQuotes = true; // Encoded data must always be quoted, the equals signs are invalid in tokens.
result = EncodeMime(result); // =?utf-8?B?asdfasdfaesdf?=
@@ -63,20 +63,6 @@ internal static void SetQuality(UnvalidatedObjectCollection<NameValueHeaderValue
}
}

internal static bool ContainsNonAscii(string input)
{
Debug.Assert(input != null);

foreach (char c in input)
{
if ((int)c > 0x7f)
{
return true;
}
}
return false;
}

// Encode a string using RFC 5987 encoding.
// encoding'lang'PercentEncodedSpecials
internal static string Encode5987(string input)
@@ -88,7 +88,7 @@ internal static partial class AuthenticationHelper
}
else
{
if (HeaderUtilities.ContainsNonAscii(credential.UserName))
if (!Ascii.IsValid(credential.UserName))
{
string usernameStar = HeaderUtilities.Encode5987(credential.UserName);
sb.AppendKeyValue(UsernameStar, usernameStar, includeQuotes: false);
@@ -1582,12 +1582,10 @@ private Task WriteAsciiStringAsync(string s, bool async)
int offset = _writeOffset;
if (s.Length <= _writeBuffer.Length - offset)
{
byte[] writeBuffer = _writeBuffer;
foreach (char c in s)
{
writeBuffer[offset++] = (byte)c;
}
_writeOffset = offset;
OperationStatus operationStatus = Ascii.FromUtf16(s, _writeBuffer.AsSpan(offset), out int bytesWritten);
Debug.Assert(operationStatus == OperationStatus.Done);
_writeOffset = offset + bytesWritten;

return Task.CompletedTask;
}

@@ -1598,14 +1596,14 @@ private Task WriteAsciiStringAsync(string s, bool async)

private async Task WriteStringAsyncSlow(string s, bool async)
{
if (!Ascii.IsValid(s))
{
throw new HttpRequestException(SR.net_http_request_invalid_char_encoding);
}

for (int i = 0; i < s.Length; i++)
{
char c = s[i];
if ((c & 0xFF80) != 0)
{
throw new HttpRequestException(SR.net_http_request_invalid_char_encoding);
}
await WriteByteAsync((byte)c, async).ConfigureAwait(false);
await WriteByteAsync((byte)s[i], async).ConfigureAwait(false);
}
}

@@ -71,8 +71,6 @@
Link="Common\System\Net\CookieFields.cs" />
<Compile Include="$(CommonPath)System\Net\CookieParser.cs"
Link="Common\System\Net\CookieParser.cs" />
<Compile Include="$(CommonPath)System\Net\CaseInsensitiveAscii.cs"
Link="Common\System\Net\CaseInsensitiveAscii.cs" />
<Compile Include="$(CommonPath)System\Net\ExceptionCheck.cs"
Link="Common\System\Net\ExceptionCheck.cs" />
<Compile Include="$(CommonPath)System\Net\HttpStatusDescription.cs"
@@ -1,6 +1,7 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Buffers;
using System.Collections;
using System.Diagnostics.CodeAnalysis;
using System.Security.Authentication.ExtendedProtection;
@@ -150,22 +151,7 @@ internal void AddPrefix(string uriPrefix)
{
throw new ArgumentException(SR.net_listener_slash, nameof(uriPrefix));
}
StringBuilder registeredPrefixBuilder = new StringBuilder();
if (uriPrefix[j] == ':')
{
registeredPrefixBuilder.Append(uriPrefix);
}
else
{
registeredPrefixBuilder.Append(uriPrefix, 0, j);
registeredPrefixBuilder.Append(i == 7 ? ":80" : ":443");
registeredPrefixBuilder.Append(uriPrefix, j, uriPrefix.Length - j);
}
for (i = 0; registeredPrefixBuilder[i] != ':'; i++)
{
registeredPrefixBuilder[i] = (char)CaseInsensitiveAscii.AsciiToLower[(byte)registeredPrefixBuilder[i]];
}
registeredPrefix = registeredPrefixBuilder.ToString();
registeredPrefix = CreateRegisteredPrefix(uriPrefix, j, i);
if (NetEventSource.Log.IsEnabled()) NetEventSource.Info(this, $"mapped uriPrefix: {uriPrefix} to registeredPrefix: {registeredPrefix}");
if (_state == State.Started)
{
@@ -179,6 +165,52 @@ internal void AddPrefix(string uriPrefix)
if (NetEventSource.Log.IsEnabled()) NetEventSource.Error(this, exception);
throw;
}

static string CreateRegisteredPrefix(string uriPrefix, int j, int i)
{
int length = uriPrefix.Length;
if (uriPrefix[j] != ':')
{
length += i == 7 ? ":80".Length : ":443".Length;
}

return string.Create(length, (uriPrefix, j, i), static (destination, state) =>
{
if (state.uriPrefix[state.j] == ':')
{
state.uriPrefix.CopyTo(destination);
}
else
{
int indexOfNextCopy = state.j;
state.uriPrefix.AsSpan(0, indexOfNextCopy).CopyTo(destination);

if (state.i == 7)
{
":80".CopyTo(destination.Slice(indexOfNextCopy));
indexOfNextCopy += 3;
}
else
{
":443".CopyTo(destination.Slice(indexOfNextCopy));
indexOfNextCopy += 4;
}

state.uriPrefix.AsSpan(state.j).CopyTo(destination.Slice(indexOfNextCopy));
}

int toLowerLength = destination.IndexOf(':');
if (toLowerLength < 0)
{
toLowerLength = destination.Length;
}

if (Ascii.ToLowerInPlace(destination.Slice(0, toLowerLength), out _) != OperationStatus.Done)
{
throw new IndexOutOfRangeException(); // backward compat for non-ASCII characters
}
});
}
}

internal bool ContainsPrefix(string uriPrefix) => _uriPrefixes.Contains(uriPrefix);
@@ -1,6 +1,7 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Text;
using System.Diagnostics;
using System.Net.Mime;

@@ -70,7 +71,7 @@ internal static bool TryReadReverse(string data, int index, out int outIndex, bo
return true;
}
// Check for invalid characters
else if (data[index] > MailBnfHelper.Ascii7bitMaxValue || !MailBnfHelper.Dtext[data[index]])
else if (!Ascii.IsValid(data[index]) || !MailBnfHelper.Dtext[data[index]])
{
if (throwExceptionIfFail)
{
@@ -1,6 +1,7 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Text;
using System.Diagnostics;
using System.Net.Mime;

@@ -43,7 +44,7 @@ internal static bool TryReadReverse(string data, int index, out int outIndex, bo
// Scan for the first invalid chars (including whitespace)
for (; 0 <= index; index--)
{
if (data[index] <= MailBnfHelper.Ascii7bitMaxValue // Any Unicode allowed
if (Ascii.IsValid(data[index]) // Any ASCII allowed
&& (data[index] != MailBnfHelper.Dot && !MailBnfHelper.Atext[data[index]])) // Invalid char
{
break;
@@ -26,7 +26,6 @@ internal static class MailBnfHelper
// characters allowed inside of comments
internal static readonly bool[] Ctext = CreateCharactersAllowedInComments();

internal const int Ascii7bitMaxValue = 127;
internal const char Quote = '\"';
internal const char Space = ' ';
internal const char Tab = '\t';
@@ -226,11 +225,11 @@ internal static void ValidateHeaderName(string data)
{
//if data contains Unicode and Unicode is permitted, then
//it is valid in a quoted string in a header.
if (data[offset] <= Ascii7bitMaxValue && !Qtext[data[offset]])
if (Ascii.IsValid(data[offset]) && !Qtext[data[offset]])
throw new FormatException(SR.Format(SR.MailHeaderFieldInvalidCharacter, data[offset]));
}
//not permitting Unicode, in which case Unicode is a formatting error
else if (data[offset] > Ascii7bitMaxValue || !Qtext[data[offset]])
else if (!Ascii.IsValid(data[offset]) || !Qtext[data[offset]])
{
throw new FormatException(SR.Format(SR.MailHeaderFieldInvalidCharacter, data[offset]));
}
@@ -256,7 +255,7 @@ internal static string ReadToken(string data, ref int offset)
int start = offset;
for (; offset < data.Length; offset++)
{
if (data[offset] > Ascii7bitMaxValue)
if (!Ascii.IsValid(data[offset]))
{
throw new FormatException(SR.Format(SR.MailHeaderFieldInvalidCharacter, data[offset]));
}
@@ -367,7 +366,7 @@ internal static void GetTokenOrQuotedString(string data, StringBuilder builder,

private static bool CheckForUnicode(char ch, bool allowUnicode)
{
if (ch < Ascii7bitMaxValue)
if (Ascii.IsValid(ch))
{
return false;
}
@@ -1,6 +1,7 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Text;
using System.Diagnostics;
using System.Net.Mime;

@@ -52,7 +53,7 @@ internal static bool TryCountQuotedChars(string data, int index, bool permitUnic
}
else
{
if (!permitUnicodeEscaping && data[index] > MailBnfHelper.Ascii7bitMaxValue)
if (!permitUnicodeEscaping && !Ascii.IsValid(data[index]))
{
if (throwExceptionIfFail)
{
@@ -1,6 +1,7 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Text;
using System.Diagnostics;
using System.Net.Mime;

@@ -185,7 +186,7 @@ internal static bool TryReadReverseUnQuoted(string data, int index, bool permitU
// non-whitespace control characters as well as all remaining ASCII chars except backslash and double quote.
private static bool IsValidQtext(bool allowUnicode, char ch)
{
if (ch > MailBnfHelper.Ascii7bitMaxValue)
if (!Ascii.IsValid(ch))
{
return allowUnicode;
}
@@ -143,7 +143,7 @@ private void Initialize()
for (int i = 0; i < clientDomainRaw.Length; i++)
{
ch = clientDomainRaw[i];
if ((ushort)ch <= 0x7F)
if (Ascii.IsValid(ch))
sb.Append(ch);
}
if (sb.Length > 0)
@@ -1,6 +1,7 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Text;
using System.Diagnostics;
using System.Net.Mime;

@@ -166,7 +167,7 @@ internal static bool TryReadCfwsReverse(string data, int index, out int outIndex
}
// Check for valid characters within comments. Allow Unicode, as we won't transmit any comments.
else if (commentDepth > 0
&& (data[index] > MailBnfHelper.Ascii7bitMaxValue || MailBnfHelper.Ctext[data[index]]))
&& (!Ascii.IsValid(data[index]) || MailBnfHelper.Ctext[data[index]]))
{
index--;
}