Expose general purpose Crc32 APIs #2036

tannergooding · 2020-01-22T21:29:44Z

Rationale

Computing a Cyclic Redundancy Check (CRC) is a common algorithm used for error detection in things like networks or storage. Additionally, modern hardware provides instruction level support for computing these values. It would be beneficial if we exposed a set of general-purpose methods which allow iterative CRC32 computation.

Proposed API

namespace System.Numerics
{
    public static class BitOperations
    {
        public static uint Crc32(uint crc, byte data);
        public static uint Crc32(uint crc, ushort data);
        public static uint Crc32(uint crc, uint data);
        public static uint Crc32(uint crc, ulong data);
    }
}

Open Questions

The output is generally treated as unsigned, however .NET considers unsigned integers (other than byte) as non-CLS compliant. It would be possible to expose both versions (those that take/return int and those that take/return uint). It would likewise be good on the input data to determine if they should be signed, unsigned, or both.

The text was updated successfully, but these errors were encountered:

Wraith2 · 2020-01-22T23:06:41Z

Is this using a specific polynomial? and if so which one? and can there be a way for the user to choose?

tannergooding · 2020-01-22T23:39:26Z

can there be a way for the user to choose?

Not if we want these accelerated by the underlying hardware.

X86 uses: 0x1EDC6F41
ARM uses: 0x04C11DB7 or 0x1EDC6F41 (depending on the instruction)

Wraith2 · 2020-01-23T00:25:38Z

So if someone is dealing with zip or something using the old polynomial they'll still need to make their own. That's a bit of a shame.

Given that at least half of the time you want to calculate a crc you will be validating incoming data it might make sense to have a ReadOnlySpan<byte> overload (that works on a fixed length, leave looping to users).

GrabYourPitchforks · 2020-01-24T07:47:23Z

If we were to expose a CRC32 API over a buffer rather than an integral type I'd probably push for it not to be on the BitOperations class.

Wraith2 · 2020-01-24T08:59:22Z

If BitOperations and these methods only exist to provide access to the hardware intrinsics I agree.
It would be useful to have a CRC32HashAlgorithm available which uses these operations if they are available.

tannergooding · 2020-01-24T15:47:54Z

If BitOperations and these methods only exist to provide access to the hardware intrinsics I agree

Right, this is namely meant to be a "cross-platform" version of the hardware intrinsics (much like the other functions on System.Numerics.BitOperations) and should allow users to easily get:

if (X86.Sse42.IsSupported) { }
else if (Arm.Crc32.IsSupported) { }
else { /* software fallback */ }

I agree that having a more general purpose class that allows polynomial customization and trivial operation over spans of data would be useful, but that isn't the immediate goal of this API.

danmoseley · 2020-07-23T19:53:26Z

If I understand right, given the caller can't choose the polynomial, and we will use the fixed CPU specific algorithm if available, this API would only be useful for callers that are (a) not interoperating with some other format that has a chosen polynomial (such as zlib), and (b) do not need to exchange the value across architectures. Is that right?

tannergooding · 2020-07-23T20:24:49Z

We likely could provide both and we might even be able to expose the API in such a way that the JIT specially handles the method if the polynomial given is constant.

But as proposed, it would be limited to the given polynomials.

danmoseley · 2020-07-23T20:39:38Z

I guess I'm wondering whether there's examples when this API with the "undefined" polynomial would be useful. But maybe that's not that different to asking when the intrinsics we already have would be useful. It would have to be when you're producing/consuming on the same machine. (Unless I misunderstand)

terrajobst · 2020-08-06T19:00:56Z

Video

Looks good as proposed
We can add overloads with custom polynomials later, but we should document the ones that we're using here.

namespace System.Numerics
{
    public static class BitOperations
    {
        public static uint Crc32(uint crc, byte data);
        public static uint Crc32(uint crc, ushort data);
        public static uint Crc32(uint crc, uint data);
        public static uint Crc32(uint crc, ulong data);
    }
}

danmoseley · 2020-08-06T19:29:00Z

I watched the video and I still need clarification on my questions I think 😄

How will we choose the polynomial?
Whatever we pick, it it will only be hardware accelerated on x64/x86 OR ARMxx, or neither, but not both, right? But it will give the same results on all CPU/platforms?
Are we targeting a specific standard (eg., zlib, or some other RFC or file format) or is this primarily for when I don't care about the polynomial because I'm both calculating the CRC and verifying the CRC in my own app?

Also, it would be good to have an example of usage, especially since this API is expected to be used to checksum an array or stream. How does that look?

tannergooding · 2020-08-06T19:37:47Z

The polynomial is 0x1EDC6F41, it is accelerated on both x86/x64 and ARM64 (I'm not sure about ARM32). ARM64 has an additional polynomial that it can also accelerate, 0x04C11DB7, which would require something like Crc32(uint crc, byte data, uint polynomial) to support. The JIT could then optimize it when polynomial is a constant of that value.

The polynomial being used is for CRC32-Castagnoli, which is fairly popular/prevalent and adopted in several standards; but which is not universaly used.

tannergooding · 2020-08-06T19:39:33Z

In particular, wikipedia lists Castagnoli as used by iSCSI, SCTP, G.hn payload, SSE4.2, Btrfs, ext4, Ceph. Arm64's other polynomial is used by ISO 3309 (HDLC), ANSI X3.66 (ADCCP), FIPS PUB 71, FED-STD-1003, ITU-T V.42, ISO/IEC/IEEE 802-3 (Ethernet), SATA, MPEG-2, PKZIP, Gzip, Bzip2, POSIX cksum,[52] PNG,[53] ZMODEM, many others, but has no built-in acceleration for x86/x64.

danmoseley · 2020-08-06T19:51:06Z

I see. That also seems to be the one used by the most popular package (CRC32C.NET)

shaggygi · 2020-08-10T12:09:10Z

Any plans to add Crc8 and Crc16 support, as well?

Used in protocol messages (https://sourceforge.net/p/bacnet/mailman/message/1259086/).

Wraith2 · 2020-08-10T13:16:24Z

Not as part of this work because I don't believe there is hardware intrinsic that provides either of those.
There does seem to be some interest in having CRC algorithms in general but I think that would need a separate api proposal and it's too late in the release cycle to get them into 5 even if approved.

Gnbrkm41 · 2020-09-09T19:33:47Z

Could I give this a try?

jkotas · 2021-01-31T18:52:15Z

That optimization only kicks in for byte and sbyte arrays. You would need to store the table as byte array.

I do not think the precomputed table makes sense for this. Computing the table on the fly for the fallback case should be just fine.

MichalPetryka · 2021-01-31T18:53:32Z

That optimization only kicks in for byte and sbyte arrays. You would need to store the table as byte array.

I do not think the precomputed table makes sense for this. Computing the table on the fly for the fallback case should be just fine.

The description of that PR says arrays of primitive literals. so I assumed that uints would also count.
EDIT: When reading the comments on that PR I saw that it only works with 1B elements due to endianness, ok then.

MichalPetryka · 2021-01-31T19:15:43Z

I've changed the table to be stored in a uint[] and added a benchmark using ??=, which seems to have only a minimal impact on performance. As we can see, all table based implementations are roughly the same, while the no table one (which I've taken from https://medium.com/@nxtchg/crc-32-without-lookup-tables-1682405aa048 and adapted into C#) is noticeably slower.

BenchmarkDotNet=v0.12.1, OS=Windows 8.1 (6.3.9600.0)
Intel Core i3-4130 CPU 3.40GHz (Haswell), 1 CPU, 4 logical and 2 physical cores
Frequency=3319351 Hz, Resolution=301.2637 ns, Timer=TSC
.NET Core SDK=5.0.102
  [Host]        : .NET Core 5.0 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT
  .NET Core 5.0 : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT

Job=.NET Core 5.0  Runtime=.NET Core 5.0

Method	Mean	Error	StdDev	Ratio	RatioSD
SSE	0.0468 ns	0.0103 ns	0.0091 ns	0.008	0.00
'Precomputed Table'	5.7565 ns	0.0564 ns	0.0528 ns	1.000	0.00
'Static constructor table'	5.7247 ns	0.0299 ns	0.0265 ns	0.995	0.01
'Lazy generated table'	5.8886 ns	0.0319 ns	0.0283 ns	1.023	0.01
'No table'	90.6089 ns	0.1864 ns	0.1653 ns	15.743	0.16

tannergooding · 2021-01-31T19:45:44Z

The ReadOnlySpan<byte> => new[] { } trick only works for byte/sbyte due to endianness.

You need to do readonly T[] = new [] { } for other primitive types (short, ushort, int, uint, long, and ulong) for it to be efficient: https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA+ABATARgLABQGAzAATakDCpA3oaQ6QA5QCWAbgIYAuM5OSUgCUYnACYB5AHYAbAJ4BlJpykAeAK6sp3AHykAKp2AyYOUgF49UmAHcA2gF1ahAJCNSABgQefvnwFUAGk8EADEsJAAhAA5iD2IgkIBRHGJIgHYPUPTEr1SAViziUJRchEp0gE4AQUr0nFCy4nzQnBwUHEoyiOqcJPSk6LKAERRKatQkyKD6RgBfAG5ZhmXmNi5efkFYcQhZOVJNbUcDIxMsC1Jreyc6AjdGLz8/MvCo2PiylLTM7LKCoolMoVGp1BpNFptDpdYJeHp9AZDWEIUbjSbTQKrRaEOZAA

MichalPetryka · 2021-01-31T19:50:18Z

Yeah, noticed after reading the commets on that PR, was my fault for not checking them.

tannergooding · 2021-03-26T19:58:35Z

This came up as part of #24328, and we probably want to use Crc32C as the name to match what will likely be exposed in the more general-purpose hashing algorithms:

namespace System.Numerics
{
    public static class BitOperations
    {
        public static uint Crc32C(uint crc, byte data);
        public static uint Crc32C(uint crc, ushort data);
        public static uint Crc32C(uint crc, uint data);
        public static uint Crc32C(uint crc, ulong data);
    }
}

CC. @bartonjs, @terrajobst, @GrabYourPitchforks could we confirm this is desired and that the above is the desired implementation shape?

terrajobst · 2021-03-26T20:25:50Z

Makes sense to us (@dotnet/fxdc)

wiz0u · 2021-08-24T08:18:44Z

Can you expose as well a method for Crc32 computation on buffers ? I don't think .NET has a public one.

For example, simply turning System.IO.Compression.Crc32Helper public would be nice.

tannergooding · 2021-08-24T17:20:53Z

@wiz0u have you seen #24328? This includes general purpose Crc32 APIs and will be in .NET 6 (under System.IO.Hashing: https://source.dot.net/#System.IO.Hashing/System/IO/Hashing/Crc32.cs,6471de2466bb4aa7).

wiz0u · 2021-08-24T17:47:22Z

@wiz0u have you seen #24328? This includes general purpose Crc32 APIs and will be in .NET 6 (under System.IO.Hashing: https://source.dot.net/#System.IO.Hashing/System/IO/Hashing/Crc32.cs,6471de2466bb4aa7).

ah great! thanks!

deeprobin · 2021-09-12T08:12:09Z

I would suggest adding another method to the proposal with the nuint parameter type.

MichalPetryka · 2021-09-13T21:31:15Z

I would suggest adding another method to the proposal with the nuint parameter type.

I wouldn't find that necessarily useful, the point of the overloads is to hash multiple bytes at once so it usually involves reading byte array values as different types, which gets a bit unclean when the type we read isn't constant size.

tannergooding · 2021-09-13T21:46:03Z

I wouldn't find that necessarily useful, the point of the overloads is to hash multiple bytes at once so it usually involves reading byte array values as different types, which gets a bit unclean when the type we read isn't constant size.

The type is a constant size for a given run (32-bits on 32-bit machines and 64-bits on 64-bit machines), as is the result (32-bits). It only makes a difference on how efficiently you can process the bits for large inputs (nuint allowing you to process a full register of bits every iteration, incrementing is still sizeof(nuint) bytes per iteration).

bartonjs · 2021-09-14T19:17:53Z

@tannergooding I can't tell if you're arguing for, against, or just providing data 😄.

        public static uint Crc32C(uint crc, nuint data);

seems like a reasonable addition on the surface of things, but won't it produce different answers on 32-bit and 64-bit executions for the same 32-bit compatible value, since a 64-bit run sees 4 extra bytes in the payload?

I'm probably leaning toward not having it and making the caller be clear if they always want 64-bit semantics or if they wanted platform native, on the assumption that nuint val = 5; return Crc32C(0, val); would give different answers on different CPUs.

MichalPetryka · 2021-09-14T19:22:13Z

@tannergooding I can't tell if you're arguing for, against, or just providing data .
        public static uint Crc32C(uint crc, nuint data);
seems like a reasonable addition on the surface of things, but won't it produce different answers on 32-bit and 64-bit executions for the same 32-bit compatible value, since a 64-bit run sees 4 extra bytes in the payload?

I'm probably leaning toward not having it and making the caller be clear if they always want 64-bit semantics or if they wanted platform native, on the assumption that nuint val = 5; return Crc32C(0, val); would give different answers on different CPUs.

It will produce different values.

tannergooding · 2021-09-14T20:39:27Z

@bartonjs, these are the primitive building blocks and so it's up the the author of a more general Crc32 loop to correctly handle any trailing bytes.\

I would expect, for example, if you have 11 bytes that you'd do: 1 or 2 iterations using nuint and then handle the remaining 3 bytes independently of the main loop, likely using the ushort or byte overloads, rather than explicitly loading up a nuint sized value. The same goes for if you were using uint or ulong directly.

The only difference between all of these is how many bytes get operated on "per iteration".

kasperk81 · 2021-11-14T14:34:50Z

inconsistency between System.Numerics.BitOperations.Crc32C(uint crc, byte data) and System.IO.Hashing.Crc32C.Hash(byte[] data) is the initial value of crc is hardcoded (0xFFFF_FFFFu) in System.IO.Hashing one. should there be a matching overload System.IO.Hashing.Crc32C.Hash(uint crc, byte[] data)?

Wraith2 · 2021-11-14T14:57:52Z

I don't think that's required. The System.IO.Hashing is a oneshot not an additive so you pass it the total data and then it computes it all which will include the final mask which is normally used to tell the difference between a hash of 0 entries and an empty hash. The intrinsic is just the underlying platform instruction being exposed for use.

bartonjs · 2021-11-14T23:06:53Z

In addition to the things @Wraith2 already pointed out, the System.IO.Hashing.Crc32 type is computing a different 32-bit CRC than this primitive. System.IO.Hashing.Crc32 computes the same CRC-32 used for Ethernet; this one is used for "CRC-32C", which is used in ext4.

https://stackoverflow.com/questions/26429360/crc32-vs-crc32c

CRC-32: https://reveng.sourceforge.io/crc-catalogue/17plus.htm#crc.cat.crc-32-iso-hdlc
CRC-32C: https://reveng.sourceforge.io/crc-catalogue/17plus.htm#crc.cat.crc-32-iscsi

deeprobin · 2022-04-06T19:39:00Z

This issue is blocked by #62692.

@tannergooding can you please remove the up-for-grabs label 😄?

Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jan 22, 2020

tannergooding added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Numerics labels Jan 22, 2020

tannergooding added api-ready-for-review and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation untriaged New issue has not been triaged by the area owner labels Feb 11, 2020

tannergooding added this to the Future milestone Jun 23, 2020

carlossanlop added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-ready-for-review labels Jul 6, 2020

danmoseley mentioned this issue Aug 2, 2020

System.IO.Compression.Crc32Helper should use Crc32 intrinsics #40244

Open

terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Aug 6, 2020

tannergooding added the help wanted [up-for-grabs] Good issue for external contributors label Aug 27, 2020

zlatanov mentioned this issue Apr 28, 2021

WebSocket Compression #49304

Merged

tannergooding mentioned this issue Jun 16, 2021

Add non-cryptographic hash algorithms #53623

Merged

deeprobin mentioned this issue Nov 14, 2021

[API Implementation]: Expose general purpose Crc32 APIs #61558

Merged

10 tasks

ghost added the in-pr There is an active PR which will close this issue when it is merged label Nov 14, 2021

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Mar 12, 2022

ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 29, 2022

stephentoub mentioned this issue Sep 28, 2022

[API Proposal]: Simple one-shot hashing methods that produce the relevant type #76279

Closed

dakersnar closed this as completed in #61558 Oct 3, 2022

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 3, 2022

ghost locked as resolved and limited conversation to collaborators Nov 2, 2022

Expose general purpose Crc32 APIs #2036

Expose general purpose Crc32 APIs #2036

Comments

tannergooding commented Jan 22, 2020

Rationale

Proposed API

Open Questions

Wraith2 commented Jan 22, 2020

tannergooding commented Jan 22, 2020

Wraith2 commented Jan 23, 2020

GrabYourPitchforks commented Jan 24, 2020

Wraith2 commented Jan 24, 2020

tannergooding commented Jan 24, 2020

danmoseley commented Jul 23, 2020

tannergooding commented Jul 23, 2020

danmoseley commented Jul 23, 2020

terrajobst commented Aug 6, 2020 • edited by dotnet-api-review bot Loading

danmoseley commented Aug 6, 2020 • edited Loading

tannergooding commented Aug 6, 2020

tannergooding commented Aug 6, 2020

danmoseley commented Aug 6, 2020

shaggygi commented Aug 10, 2020 • edited Loading

Wraith2 commented Aug 10, 2020

Gnbrkm41 commented Sep 9, 2020

jkotas commented Jan 31, 2021

MichalPetryka commented Jan 31, 2021 • edited Loading

MichalPetryka commented Jan 31, 2021

tannergooding commented Jan 31, 2021 • edited Loading

MichalPetryka commented Jan 31, 2021

tannergooding commented Mar 26, 2021

terrajobst commented Mar 26, 2021

wiz0u commented Aug 24, 2021

tannergooding commented Aug 24, 2021

wiz0u commented Aug 24, 2021

deeprobin commented Sep 12, 2021

MichalPetryka commented Sep 13, 2021

tannergooding commented Sep 13, 2021 • edited Loading

bartonjs commented Sep 14, 2021

MichalPetryka commented Sep 14, 2021

tannergooding commented Sep 14, 2021

kasperk81 commented Nov 14, 2021

Wraith2 commented Nov 14, 2021

bartonjs commented Nov 14, 2021

deeprobin commented Apr 6, 2022

terrajobst commented Aug 6, 2020 •

edited by dotnet-api-review bot

Loading

danmoseley commented Aug 6, 2020 •

edited

Loading

shaggygi commented Aug 10, 2020 •

edited

Loading

MichalPetryka commented Jan 31, 2021 •

edited

Loading

tannergooding commented Jan 31, 2021 •

edited

Loading

tannergooding commented Sep 13, 2021 •

edited

Loading