Float16 type #3467
Comments
LLVM 3.1 added support for half floats, so this should be doable. Marking as 'up for grabs'.
Since this is strictly a storage type, very few operations are needed – mostly conversion to and from larger float types.
@rwgardner, my guess is that this will happen sooner if you submit a pull request. ("Up for grabs" is a good choice here, and it basically means "waiting for someone to do it." Since you want the feature...) It's good that you first submitted it as an issue, however, in case there were strong objections; since that doesn't seem to be the case, it looks like the way is clear for you to add this feature. Some time in the not-too-distant past, support for
Float16 should be substantially easier than Int128. Up for grabs is more like "waiting for someone to do it and pretty nicely isolated and doable by a determined newcomer."
The cool thing about I believe a first cut implementation can be done by leveraging bitshifts and such the way @rwgardner has already done, and it would be nice to receive that as a pull request as a starting point.
Sounds good. I'm not "grabbing" this yet, but I will if I really want it done. (Unfortunately, I don't get paid to work on Julia for the most part, which means I need to do this in my free time. That's something I'd love to do, but in short, a new first baby due any day has been and will be dominating that for a while.)
Is it possible for you to isolate the code that you have already written for
Outline of what needs to be done:
@JeffBezanson, any thoughts on whether it's better to add new specific intrinsics (
If rwgardner is alright with it, I can try implementing
@mattgallivan Please jump in. The more the merrier. @StefanKarpinski's outline is basically what needs to be done, and one can follow the
Just to expand on what I mean by "generalizing the existing ones", this means turning the
The Int stuff already does that and it would be nice to do so with FloatingPoint too. I wonder if we should take this opportunity to also add Float128 at the same time, assuming LLVM supports it.
Since there is no hardware support for quad-precision arithmetic, adding Float128 is quite a bit more complicated.
Yeah, that's a whole different can of worms. You actually want to compute with Float128 or it's completely useless. For Float16, it's fine to just be able to store them.
@mattgallivan all sounds good. I would love to contribute and would have a lot of fun doing it, but my life is about as insane as it's ever been right now. Hopefully I can contribute in other ways in the future. You may not want this (I'm sure it could be written more efficiently, etc., and you may want to do it in Fortran or C), but here's what I have. It also hasn't been heavily validated yet, but you might use it for validation by comparing it to your code. I haven't done any conversion back to Float16.
We could convert to only Float32 or Float64 and then use existing code to convert between those. It seems more efficient to convert to/from both directly in most cases, but it may not be on some architectures, partly depending on whether there is hardware support for converting between Float32 and Float64. (I don't know if that's something floating point units typically support or not.)
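For readers following the conversion discussion, here is a minimal sketch of the kind of bit-shift decoding talked about above. It is not @rwgardner's code: the function name `half_bits_to_float32` is made up for illustration, and the syntax is current Julia, which postdates this thread. It takes the "convert to only Float32" route, so a Float64 result would go through the existing Float32-to-Float64 conversion.

```julia
# Hypothetical helper (not from the thread): decode raw IEEE 754 half-precision
# bits into a Float32 using shifts and masks only.
function half_bits_to_float32(h::UInt16)
    sign = UInt32(h & 0x8000) << 16       # sign bit moved up to bit 31
    expo = UInt32((h >> 10) & 0x001f)     # 5-bit biased exponent (bias 15)
    frac = UInt32(h & 0x03ff)             # 10-bit fraction
    if expo == 0x1f
        # Inf or NaN: all-ones exponent, fraction carried over
        bits = sign | 0x7f800000 | (frac << 13)
    elseif expo != 0
        # Normal number: rebias the exponent from 15 to 127 (add 112 = 0x70)
        bits = sign | ((expo + 0x00000070) << 23) | (frac << 13)
    elseif frac == 0
        # Signed zero
        bits = sign
    else
        # Subnormal half: value is frac * 2^-24, which is exact in Float32
        bits = sign | reinterpret(UInt32, ldexp(Float32(frac), -24))
    end
    return reinterpret(Float32, bits)
end

x32 = half_bits_to_float32(0x3e00)   # bit pattern of 1.5 in half precision
x64 = Float64(x32)                   # widening Float32 -> Float64 is exact
```

Every half-precision value is exactly representable in Float32, so this direction needs no rounding; the conversion back to 16 bits is the part that has to deal with rounding and overflow.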
@StefanKarpinski Would it be good to start off with this as a pure Julia implementation and get it in base to begin with?
Until the LLVM bug is sorted out, it may be worthwhile to put @rwgardner's Julia implementation in Base. That way, at least the storage format can be used, and the conversions could potentially be faster once the LLVM issue is fixed. @loladiro Does LLVM 3.3 fix the Float16 bugs?
Even using @rwgardner's conversions, the following patch unfortunately still causes LLVM failures: https://gist.github.com/StefanKarpinski/9092d04bc24c44493d08
julia> float16(1.5)
LLVM ERROR: Cannot select: 0x104151b10: ch = store 0x102070910, 0x10421df10, 0x104231d10, 0x10434d410<ST2[%14]> [ORD=77165] [ID=35]
0x10421df10: f16,ch = load 0x10434dc10, 0x102070010, 0x10434d410<LD2[FixedStack0]> [ORD=77156] [ID=27]
0x102070010: i64 = FrameIndex<0> [ORD=77155] [ID=4]
0x10434d410: i64 = undef [ORD=77150] [ID=2]
0x104231d10: i64 = add 0x104233910, 0x1041a7810 [ORD=77163] [ID=33]
0x104233910: i64,ch,glue = CopyFromReg 0x104087a10, 0x104088010, 0x104087a10:1 [ORD=77157] [ID=32]
0x104088010: i64 = Register %RAX [ORD=77157] [ID=10]
0x104087a10: ch,glue = callseq_end 0x10434da10, 0x104264310, 0x104264310, 0x10434da10:1 [ORD=77157] [ID=31]
0x104264310: i64 = TargetConstant<0> [ORD=77155] [ID=5]
0x104264310: i64 = TargetConstant<0> [ORD=77155] [ID=5]
0x10434da10: ch,glue = X86ISD::CALL 0x104279410, 0x104232910, 0x104085410, 0x10417a710, 0x104279410:1 [ORD=77157] [ID=30]
0x104232910: i64 = X86ISD::Wrapper 0x104085310 [ID=16]
You'll still want to leave in the disable in the compiler, otherwise LLVM will generate bad code. LLVM 3.3 does not fix this.
Yes, with this implementation no compiler changes are needed; it's just a 16-bit bitstype.
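A rough sketch of what a storage-only 16-bit type looks like, written in current Julia syntax (where `bitstype` is spelled `primitive type`). The name `StoredHalf` is hypothetical, and as an assumption for brevity the built-in Float16 that current Julia ships stands in for the hand-written conversion routines discussed in this thread. The point is only that conversions are defined and arithmetic is not.

```julia
# Hypothetical type name, for illustration only: 16 bits of storage, no supertype.
primitive type StoredHalf 16 end

# Only conversions are defined; there is deliberately no arithmetic on StoredHalf.
# Assumption: the built-in Float16 stands in for hand-written bit-level conversions.
StoredHalf(x::Real) = reinterpret(StoredHalf, Float16(x))
Base.Float32(h::StoredHalf) = Float32(reinterpret(Float16, h))

stored = StoredHalf[StoredHalf(1.5), StoredHalf(-0.25), StoredHalf(1024)]
nbytes = sizeof(stored)                   # 2 bytes per element
total = sum(Float32(h) for h in stored)   # all arithmetic happens in Float32
```

Here the array occupies 6 bytes rather than the 12 a Float32 array would need, and every computation first widens to Float32.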
Ok, if someone wants to finish this, I'm away for the day.
Bump.
@StefanKarpinski do you just want to apply your patch?
I don't think just applying the patch works. There were a bunch of changes it needed in order to work.
It would be good to have a nicer show() method for Float16. Asking the question here in case this was done by design.
Printing 16-bit floats correctly and minimally is quite non-trivial. Our 32-bit and 64-bit float printing is handled by the double-conversion library, which does not support 16-bit floats. It might be possible to figure out a hack that approximates correct minimal Float16 printing using the printing routines for Float32, but it's not obvious how.
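One naive way such a hack could look, as a sketch only: round the widened Float32 to an increasing number of significant digits and stop at the first candidate that converts back to the same Float16, then let the Float32 printer do the output. The function name `shortest_half_string` is made up, the code is current Julia (which postdates this thread), and it is not claimed to be minimal in every case, since it only tries the nearest decimal at each digit count.

```julia
# Hypothetical helper: approximate shortest round-trip printing for Float16
# by borrowing Float32 rounding and printing.
function shortest_half_string(x::Float16)
    isfinite(x) || return string(x)        # Inf and NaN are printed as-is
    wide = Float32(x)
    for nd in 1:5                          # 5 significant digits always suffice for 11 bits
        candidate = round(wide, sigdigits=nd)
        if Float16(candidate) == x         # does this shorter decimal round-trip?
            return string(candidate)       # printed by the Float32 routine
        end
    end
    return string(wide)                    # fallback: exact widened value
end

s1 = shortest_half_string(Float16(0.1))
s2 = shortest_half_string(Float16(1.5))
```

For example, Float16(0.1) is stored as 0.0999755859375, but the one-digit candidate 0.1 already converts back to the same Float16, so that is what gets printed.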
I wonder what is going on here:
(plus
This is a request for support for half-precision floating point numbers (Float16s).
(If there has been any discussion about adding support for these, which I would expect there was, I did not find it.)
Although the precision is low, Float16s are still useful when you have a very large quantity of floating point numbers (which is what we have) and want to reduce memory footprint, cache impact, or disk storage. (Currently, we manually convert our half precision floats with bit manipulations and reinterpretation, but the code would be cleaner if Julia supported them natively.)
Thanks.