Support for BFloat16s #123

Open
maleadt opened this issue Aug 8, 2024 · 0 comments

maleadt commented Aug 8, 2024

Julia 1.11 supports BFloat16 code generation through BFloat16s.jl (JuliaLang/julia#51470), and on modern CPUs these operations may even be hardware accelerated, for example by AVX512BF16 on a recent Intel CPU in combination with Julia 1.12 (only LLVM 18 supports automatic vectorization of these operations):

julia> using BFloat16s
julia> g(x::NTuple{N,VecElement{Float32}}) where N = ntuple(i->BFloat16(x[i].value), Val(N))
julia> x = ntuple(i->VecElement(Float32(i)), 16)

julia> @code_native debuginfo=:none g(x)
	push	rbp
	mov	rbp, rsp
	mov	rax, rdi
	vcvtneps2bf16	ymm0, zmm0
	vmovups	ymmword ptr [rdi], ymm0
	pop	rbp
	vzeroupper
	ret

It'd be nice if it were possible to use SIMD.jl with BFloat16 numbers. I'm not really familiar with the code here, but I guess a starting point is adding Core.BFloat16 (available on 1.11+) to the FloatingTypes union type, possibly also relying on a BFloat16s.jl package extension for other functionality.
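For concreteness, a minimal sketch of what that starting point might look like, assuming SIMD.jl defines FloatingTypes as a plain Union of the scalar float types it supports (the actual definition in the package may differ), with an isdefined guard so it still loads on Julia versions before 1.11:

# Hypothetical sketch, not the actual SIMD.jl source: extend the union of
# supported floating-point element types with Core.BFloat16 when available.
@static if isdefined(Core, :BFloat16)
    const FloatingTypes = Union{Float16, Float32, Float64, Core.BFloat16}
else
    const FloatingTypes = Union{Float16, Float32, Float64}
end

# Hypothetical usage this would enable (assumes the rest of SIMD.jl's
# machinery, e.g. the Vec constructors and arithmetic, accepts the new type):
# julia> using SIMD, BFloat16s
# julia> v = Vec{8,BFloat16}(ntuple(i -> BFloat16(i), 8))
# julia> v + v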
