
Implementation feedback: x86/x64 lacks float64 <> float16 instructions #12

Open
syg opened this issue Mar 28, 2024 · 17 comments

Comments

@syg

syg commented Mar 28, 2024

Currently, Math.f16round and assigning into Float16Array does a conversion from float64 to float16 using the roundTiesToEven mode.

On x86/x64, as far as V8 can tell, there are no generally available instructions to directly narrow float64 to float16. There are two extensions that support float16: F16C, which has been widely available since the early 2010s, and AVX512-FP16, which is much newer and is currently only available on the latest Sapphire Rapids Xeon chips. F16C only supports converting between float32 and float16. AVX512-FP16 does support float64 to float16 directly, but again, it is only on the newest Xeons.

(On ARM such instructions exist, this is an x86-only issue.)

Notably, under roundTiesToEven, narrowing float64 to float16 is not equivalent to narrowing float64 to float32 and then to float16. This means that on x86/x64, there is no efficient codegen possible for direct float16 rounding in the foreseeable future. Efficient codegen is only possible by first converting to float32, e.g. Math.f16round(Math.fround(n)) and f16a[idx] = Math.fround(n).
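
For a concrete instance (a sketch, assuming Math.f16round behaves as proposed): 2 ** -11 is exactly halfway between the float16 neighbors 1 and 1 + 2 ** -10, and the extra 2 ** -40 term is too small to survive a float32 intermediate but large enough to break the tie in a direct conversion.

const x = 1 + 2 ** -11 + 2 ** -40;

Math.f16round(x);              // 1.0009765625 (the 2 ** -40 breaks the tie upward)
Math.f16round(Math.fround(x)); // 1 (the 2 ** -40 is lost, and ties-to-even picks 1)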

The main problem is that the obvious code you want to write is slow because it requires software emulation.

There are alternatives, but each one is kind of unsavory in its own way.

  • We could spec float64 to float16 conversion directly as float64 to float32 to float16, under roundTiesToEven. This is unsavory because Math.f16round sure doesn't sound like it should be doing an intermediate narrowing first.
  • We could spec float64 to float16 conversion as using roundTiesToZero, in which case it's unobservable if you first round to a float32 intermediate. This is unsavory because all other floating point operations in JS use roundTiesToEven.
  • We could add a new f16via32round and Float16ViaFloat32Array (names up to bikeshedding) that do an intermediate round.

For comparison, this is probably less of an issue for other languages that are adding float16, because float32 is already a primitive type in those languages. In C++, this would only come up if you're actually converting doubles to halves. Similar for Wasm.

But for JS, since the only number type we have is double, it'll end up being a very common operation, whether intentional or not. I think there's also a question of whether real-world code tends to go from float64 to float16 directly, or mostly goes from float32 (that is, maybe x86 doesn't have instructions for it because there's no demand).

Thoughts from other implementers @dminor @msaboff?

h/t @SPY

@bakkot
Collaborator

bakkot commented Mar 28, 2024

Swift also added float16 recently. You may be interested in their implementation of this logic, which does a cast to half and then handles the rounding edge cases explicitly (via some bit-twiddling I do not understand).
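
A rough JS rendering of the idea (an illustrative sketch, not Swift's actual code; the constants are plain float32/float16 bit-layout facts): narrow float64 -> float32 first, and if the float32 result lands exactly on a float16 halfway pattern even though the original double was not a tie, nudge the float32 one ulp toward the double before the final float32 -> float16 conversion. The closing Math.f16round stands in for that last step, which an engine would do in hardware (e.g. vcvtps2ph); by then its input is no longer a false tie, so the result matches a direct roundTiesToEven narrowing.

const view = new DataView(new ArrayBuffer(4));

function f16FromF64(d) {
  let f = Math.fround(d);          // float64 -> float32, roundTiesToEven
  view.setFloat32(0, f);
  let bits = view.getUint32(0);
  const mag = bits & 0x7fffffff;   // magnitude, sign bit dropped
  // Only magnitudes in [2**-25, 65536) can round differently via float32;
  // below that range everything goes to zero, above it to infinity/NaN.
  if (mag >= 0x33000000 && mag < 0x47800000 && d !== f) {
    // A float16 halfway pattern has low 13 bits 0x1000 for normal results
    // (>= 2**-14, i.e. 0x38800000) and all-zero low 13 bits for subnormals.
    const halfway = mag >= 0x38800000 ? 0x1000 : 0x0000;
    if ((mag & 0x1fff) === halfway) {
      // False tie: the double carried the lost direction, so move one
      // float32 ulp toward it (in magnitude) before the final rounding.
      bits += Math.abs(d) > Math.abs(f) ? 1 : -1;
      view.setUint32(0, bits);
      f = view.getFloat32(0);
    }
  }
  return Math.f16round(f);         // stands in for hardware float32 -> float16
}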

@anba

anba commented Mar 29, 2024

And AVX512-FP16, which is much newer and is currently only available on the latest Sapphire Rapids Xeon chips. Both extensions only support converting float32 to float16.

vcvtsh2sd and vcvtsd2sh from AVX512-FP16 support direct float64<>float16, but as you've noted, this extension is currently only available on Sapphire Rapids processors. (float16 to float64 conversion doesn't require AVX512-FP16 when implemented through vcvtph2ps followed by vcvtss2sd, which means it only requires support for F16C.)
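
(As a quick illustration of why that widening chain is exact, assuming the proposal's Float16Array: every float16 value is exactly representable as a float32, and every float32 as a float64, so neither widening step can lose information.)

const a = new Float16Array([1.337]);
a[0];                        // 1.3369140625, the stored float16 value, exact as a double
Math.fround(a[0]) === a[0];  // true: it is already exactly a float32 value too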

(On ARM such instructions exist, this is an x86-only issue.)

More detailed:

  • ARM64 supports float64<>float16 through fcvt.
  • ARM32 with Neon supports float16 to float64 through vcvtb.f32.f16 followed by vcvt.f64.f32. float64 to float16 requires software emulation.

@syg
Author

syg commented Mar 29, 2024

vcvtsh2sd and vcvtsd2sh from AVX512-FP16 support direct float64<>float16,

Aren't those for integers?

@anba

anba commented Mar 29, 2024

Aren't those for integers?

No, I don't think so. Clang/GCC also emit both instructions when using -mavx512fp16: https://godbolt.org/z/4d9M4fMc9

@syg
Author

syg commented Mar 29, 2024

Thanks for checking, corrected the OP.

@syg
Author

syg commented Apr 1, 2024

After sitting on this for a while, I think my main concern is that, unlike in other languages where float32 is a primitive type, in JS it will be too easy to do the slow thing of assigning directly into a Float16Array.

@bakkot What are your thoughts on having Float16Via32Array (hopefully with a less horrible name)? We have precedent for TAs with special conversion rules in Uint8ClampedArray.
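
A rough shim of the semantics in question (name and shape hypothetical, and JS can't hook indexed writes on a TypedArray subclass, so the store method just illustrates the conversion rule an engine would implement natively as cvtsd2ss + vcvtps2ph with no fixup code):

class Float16Via32Array extends Float16Array {
  store(index, value) {               // stands in for `this[index] = value`
    this[index] = Math.fround(value); // round to float32 first, then float16
  }
}

const x = 1 + 2 ** -11 + 2 ** -40;  // the double-rounding example from the OP
const a = new Float16Array(1);
const b = new Float16Via32Array(1);
a[0] = x;                           // 1.0009765625 (direct roundTiesToEven)
b.store(0, x);                      // 1 (via the float32 intermediate)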

@ljharb
Member

ljharb commented Apr 1, 2024

The weirdness of Uint8Clamped has come up a number of times, and is a big part of the pushback to https://github.com/tc39/proposal-dataview-get-set-uint8clamped - that seems like an undesirable precedent to follow.

@syg
Author

syg commented Apr 1, 2024

The weirdness of Uint8Clamped has come up a number of times, and is a big part of the pushback to https://github.com/tc39/proposal-dataview-get-set-uint8clamped - that seems like an undesirable precedent to follow.

It'll be as unmotivated to add to DataView as Uint8Clamped, but why is that undesirable? That DataView isn't 1:1 with the TA constructors seems like a fairly small thing to me.

@ljharb
Member

ljharb commented Apr 1, 2024

I meant that the general feeling has consistently been "Uint8Clamped was a mistake and it'd be better if it didn't exist".

@syg
Author

syg commented Apr 1, 2024

I meant that the general feeling has consistently been "Uint8Clamped was a mistake and it'd be better if it didn't exist".

Could you refresh my memory on this feeling? My recollection of the discussion around the DataView proposal is that it was primarily about a getUint8Clamped getter being misleading, in that clamping is a conversion behavior that happens on set, and there isn't such a thing as Uint8Clamped as an integer type. I don't see the connection back to "Uint8ClampedArray itself considered harmful".

@ljharb
Member

ljharb commented Apr 1, 2024

Sorry if I was unclear. You're correct about the implications for my proposal that were discussed - but the sentiment I've repeatedly heard around any Typed Array discussion in committee, including that one, is that Uint8ClampedArray is "weird" and shouldn't be there.

@syg
Author

syg commented Apr 1, 2024

I see. I don't know if it's "weird" per se, but Uint8ClampedArray has a narrower use case than the other TA constructors. So does Float16Array, though, and I'd like the common use patterns to not be slow. The main motivation of Float16Array is that we're making a TA type to better serve WebGPU and similar kinds of APIs. This is very much like Uint8ClampedArray, which exists to serve processing APIs that want bytes known to not have underflowed/overflowed.

@bakkot
Collaborator

bakkot commented Apr 2, 2024

Hm. Since this is a pretty specialized use, it would not be the worst thing to have such an ugly operation, but I'd want to know how bad the penalty of doing it in software actually is. The approach Swift takes looks like it should be cheap almost all of the time (i.e. it's pretty rare that you would actually take the branch).

I suspect a lot of applications would prefer to pay the performance penalty in exchange for having the behavior match (say) PyTorch or C++.

Would we expose both? If so I'd also want to expose whether the hardware supports Float16 casts, so that you could choose that option on hardware which supported it.

@syg
Author

syg commented Apr 2, 2024

I suspect a lot of applications would prefer to pay the performance penalty in exchange for having the behavior match (say) PyTorch or C++.

This is where my question above about how people are using float16 comes from. It is true that other languages let you go from float64 -> float16, but are most people actually going through float32 anyway, since their source data is actually in float32? If so, matching that behavior today would mean calling a lot of Math.fround.

Would we expose both? If so I'd also want to expose whether the hardware supports Float16 casts, so that you could choose that option on hardware which supported it.

Yes, definitely both.

@anba

anba commented Jul 29, 2024

Implementing the technique linked in #12 (comment) reduces the overhead of Float64 -> Float16 conversion to ~50% compared to using Math.fround for Float64 -> Float32 -> Float16 in local µ-benchmarks.

Relevant SpiderMonkey bugs:

Generated x86 assembly code to convert Float64 -> Float16:

cvtsd2ss   %xmm1, %xmm0         # Convert Float64 -> Float32.
vmovd      %xmm0, %r12d         # Get bit representation of Float32 value.
andl       $0x7fffffff, %r12d   # Mask off sign bit.
cmpl       $0x33000000, %r12d   # Underflow to zero.
jb         .L1
cmpl       $0x47800000, %r12d   # Overflow to infinity.
jae        .L1
cmpl       $0x38800000, %r12d   # Detection of subnormal numbers.
setae      %bl
movzbl     %bl, %ebx
shll       $12, %ebx
andl       $0x1fff, %r12d
cmpl       %ebx, %r12d          # Check round and sticky bits.
jne        .L1
movdqa     .L2(%rip), %xmm15    # Load [0, 0, 0, 1].
psignd     %xmm1, %xmm15        # Pick +1/0/-1 from the double's low mantissa word (above/exactly at/below the tie).
paddd      %xmm15, %xmm0        # Adjust mantissa by -1/0/+1.
.L1:
vcvtps2ph  $0x4, %xmm0, %xmm0   # Convert Float32 -> Float16.

This looks okay-ish to me.

@anba

anba commented Jul 29, 2024

From what I've read, it seems like Zen 6 will support AVX512-FP16. And for future Intel CPUs, it seems like AVX10 will provide FP16 support.

@Constellation
Member

The code @anba showed looks simple, and it looks like future Intel CPUs will eventually get native support for double -> FP16. So, mildly, I prefer the current proposal.
