Support for BFloat16s #123

Open
maleadt opened this issue Aug 8, 2024 · 0 comments

maleadt commented Aug 8, 2024

Julia 1.11 supports BFloat16 code generation through BFloat16s.jl (JuliaLang/julia#51470), and on modern CPUs these operations may even be hardware accelerated, for example by AVX512BF16 on a recent Intel CPU in combination with Julia 1.12 (only LLVM 18 supports automatic vectorization of these operations):

julia> using BFloat16s
julia> g(x::NTuple{N,VecElement{Float32}}) where N = ntuple(i->BFloat16(x[i].value), Val(N))
julia> x = ntuple(i->VecElement(Float32(i)), 16)

julia> @code_native debuginfo=:none g(x)
	push	rbp
	mov	rbp, rsp
	mov	rax, rdi
	vcvtneps2bf16	ymm0, zmm0
	vmovups	ymmword ptr [rdi], ymm0
	pop	rbp
	vzeroupper
	ret

It'd be nice if it were possible to use SIMD.jl with BFloat16 numbers. I'm not really familiar with the code here, but I guess a starting point is adding Core.BFloat16 (available on 1.11+) to the FloatingTypes union type, possibly also relying on a BFloat16s.jl package extension for other functionality.
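For concreteness, a minimal sketch of what that starting point might look like, assuming SIMD.jl defines FloatingTypes as a plain Union of the scalar float types it supports (the actual definition in the package may differ), with an isdefined guard so it still loads on Julia versions before 1.11:

# Hypothetical sketch, not the actual SIMD.jl source: extend the union of
# supported floating-point element types with Core.BFloat16 when available.
@static if isdefined(Core, :BFloat16)
    const FloatingTypes = Union{Float16, Float32, Float64, Core.BFloat16}
else
    const FloatingTypes = Union{Float16, Float32, Float64}
end

# Hypothetical usage this would enable (assumes the rest of SIMD.jl's
# machinery, e.g. the Vec constructors and arithmetic, accepts the new type):
# julia> using SIMD, BFloat16s
# julia> v = Vec{8,BFloat16}(ntuple(i -> BFloat16(i), 8))
# julia> v + v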
