type inference issue with vectors of ints and floats in julia 1.10 #526

Closed
NilsNiggemann opened this issue Jan 18, 2024 · 3 comments · Fixed by JuliaSIMD/VectorizationBase.jl#109
Labels
good first issue Good for newcomers

Comments

@NilsNiggemann

Apparently, multiplying a vector of floats with a vector of ints causes problems for the Julia compiler. The type of the result res can no longer be inferred (res::Any), and this causes a performance loss.
I have tested this on Julia 1.9.3, where the example below works fine, but it does not on Julia 1.10.
Of course, one can manually promote the integers to floats before multiplying, at least as a workaround.

Minimum working example:

using LoopVectorization
function LVTest(a1,a2)
    res = zero(eltype(a1))
    @turbo for i in eachindex(a1,a2)
        res += a1[i]*a2[i]
    end
    return res
end

aFloat = zeros(10)
aInt = zeros(Int,10)

@code_warntype LVTest(aFloat,aInt) #prints type Any for res on Julia 1.10 and does not show any type instabilities for 1.9

function checkAllocs()
    aFloat = zeros(10)
    aInt = zeros(Int,10)
    LVTest(aFloat,aInt) # compile
    println("Allocations: ",@allocated LVTest(aFloat,aInt))
end

checkAllocs() # prints 0 on Julia 1.9 but 2304 on Julia 1.10
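
The workaround mentioned above (promoting the integers to floats before multiplying) could look roughly like the sketch below. The helper promote_eltypes is a hypothetical name, not part of LoopVectorization, and whether it fully restores inference on Julia 1.10 has not been verified here. It also allocates a converted copy of the integer array, so it trades one allocation for a type-stable loop.

```julia
# Hypothetical helper (not part of LoopVectorization): bring both arrays
# to a common element type before handing them to the @turbo loop.
function promote_eltypes(a1::AbstractArray, a2::AbstractArray)
    T = promote_type(eltype(a1), eltype(a2))
    # convert returns the input itself when its eltype is already T
    # and allocates a converted copy otherwise.
    return convert(AbstractArray{T}, a1), convert(AbstractArray{T}, a2)
end

aFloat = zeros(10)
aInt = zeros(Int, 10)

bFloat, bInt = promote_eltypes(aFloat, aInt)

# bFloat is the original array, bInt is a new Vector{Float64}.
# With LoopVectorization loaded, the loop from the example would then be
# called on arrays of matching element type:
# LVTest(bFloat, bInt)
```

In a hot loop the conversion would ideally happen once, outside the repeated calls, so that the copy is not made on every invocation.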

The output of versioninfo():

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, haswell)
  Threads: 1 on 8 virtual cores
Environment:
  JULIA_PKG_USE_CLI_GIT = true
  JULIA_DEPOT_PATH = /storage/niggeni/.julia_hexagon
  JULIA_IMAGE_THREADS = 1
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, haswell)
  Threads: 1 on 8 virtual cores
Environment:
  JULIA_PKG_USE_CLI_GIT = true
  JULIA_DEPOT_PATH = /storage/niggeni/.julia_hexagon
@chriselrod
Member

chriselrod commented Jan 18, 2024

Base.promote seems to have a problem:

julia> using VectorizationBase

julia> vxi = Vec(ntuple(identity, 4)...);

julia> vxf = Vec(ntuple(float, 4)...);

julia> vxiu = VecUnroll((vxi,vxi,vxi,vxi));

julia> vxfu = VecUnroll((vxf,vxf,vxf,vxf));

julia> @code_warntype VectorizationBase.promote(vxiu,vxfu,vxfu)
MethodInstance for promote(::VecUnroll{3, 4, Int64, Vec{4, Int64}}, ::VecUnroll{3, 4, Float64, Vec{4, Float64}}, ::VecUnroll{3, 4, Float64, Vec{4, Float64}})
  from promote(x, y, z) @ Base promotion.jl:397
Arguments
  #self#::Core.Const(promote)
  x::VecUnroll{3, 4, Int64, Vec{4, Int64}}
  y::VecUnroll{3, 4, Float64, Vec{4, Float64}}
  z::VecUnroll{3, 4, Float64, Vec{4, Float64}}
Locals
  @_5::Int64
  pz::VecUnroll{3, 4, Float64, Vec{4, Float64}}
  py::VecUnroll{3, 4, Float64, Vec{4, Float64}}
  px::Any
Body::Tuple{Any, VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}}
1 ─       nothing
│   %2  = Base._promote(x, y, z)::Tuple{Any, VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}}
│   %3  = Base.indexed_iterate(%2, 1)::Core.PartialStruct(Tuple{Any, Int64}, Any[Any, Core.Const(2)])
│         (px = Core.getfield(%3, 1))
│         (@_5 = Core.getfield(%3, 2))
│   %6  = Base.indexed_iterate(%2, 2, @_5::Core.Const(2))::Core.PartialStruct(Tuple{VecUnroll{3, 4, Float64, Vec{4, Float64}}, Int64}, Any[VecUnroll{3, 4, Float64, Vec{4, Float64}}, Core.Const(3)])
│         (py = Core.getfield(%6, 1))
│         (@_5 = Core.getfield(%6, 2))
│   %9  = Base.indexed_iterate(%2, 3, @_5::Core.Const(3))::Core.PartialStruct(Tuple{VecUnroll{3, 4, Float64, Vec{4, Float64}}, Int64}, Any[VecUnroll{3, 4, Float64, Vec{4, Float64}}, Core.Const(4)])
│         (pz = Core.getfield(%9, 1))
│   %11 = Core.tuple(x, y, z)::Tuple{VecUnroll{3, 4, Int64, Vec{4, Int64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}}
│   %12 = Core.tuple(px, py, pz)::Tuple{Any, VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}}
│         Base.not_sametype(%11, %12)
│   %14 = Core.tuple(px, py, pz)::Tuple{Any, VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}}
└──       return %14

Note that Base._promote is inferred as returning ::Tuple{Any,...}. Yet, @code_warntype Base._promote is fine:

julia> @code_warntype Base._promote(vxiu,vxfu,vxfu)
MethodInstance for Base._promote(::VecUnroll{3, 4, Int64, Vec{4, Int64}}, ::VecUnroll{3, 4, Float64, Vec{4, Float64}}, ::VecUnroll{3, 4, Float64, Vec{4, Float64}})
  from _promote(x, y, z) @ Base promotion.jl:374
Arguments
  #self#::Core.Const(Base._promote)
  x::VecUnroll{3, 4, Int64, Vec{4, Int64}}
  y::VecUnroll{3, 4, Float64, Vec{4, Float64}}
  z::VecUnroll{3, 4, Float64, Vec{4, Float64}}
Locals
  R::Type{VecUnroll{3, 4, Float64, Vec{4, Float64}}}
Body::Tuple{VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}}
1 ─      nothing
│        (R = Base.promote_typeof(x, y, z))
│   %3 = Base.convert(R::Core.Const(VecUnroll{3, 4, Float64, Vec{4, Float64}}), x)::VecUnroll{3, 4, Float64, Vec{4, Float64}}
│   %4 = Base.convert(R::Core.Const(VecUnroll{3, 4, Float64, Vec{4, Float64}}), y)::VecUnroll{3, 4, Float64, Vec{4, Float64}}
│   %5 = Base.convert(R::Core.Const(VecUnroll{3, 4, Float64, Vec{4, Float64}}), z)::VecUnroll{3, 4, Float64, Vec{4, Float64}}
│   %6 = Core.tuple(%3, %4, %5)::Tuple{VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}, VecUnroll{3, 4, Float64, Vec{4, Float64}}}
└──      return %6

Julia type inference being a PITA as usual.

@chriselrod chriselrod added the good first issue Good for newcomers label Jan 18, 2024
@chriselrod
Member

I marked this as a good first issue.
I'd suggest deving VectorizationBase, and playing with some promote definitions to try and get it to infer.

Then also add tests that it does infer correctly.
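
A minimal sketch of what such an inference test might look like, using only Base and the Test standard library. The helper isinferred and the function unstable are hypothetical names introduced for illustration; they are not part of VectorizationBase or its test suite.

```julia
using Test

# Hypothetical helper: true when every return type that inference finds
# for f applied to the types of args is concrete.
function isinferred(f, args...)
    rts = Base.return_types(f, Tuple{map(typeof, args)...})
    return !isempty(rts) && all(isconcretetype, rts)
end

# A deliberately type-unstable function used as a negative control.
unstable(x) = x > 0 ? 1 : 1.0

@test isinferred(promote, 1, 2.0, 3.0)  # promote on plain numbers infers
@test !isinferred(unstable, 1)          # Union{Float64, Int64} is not concrete

# Test.@inferred throws if the actual return type differs from the inferred one.
@inferred promote(1, 2.0, 3.0)
```

With VectorizationBase loaded, the same check could presumably be applied to the values constructed earlier in this thread, for example isinferred(promote, vxiu, vxfu, vxfu), which would be expected to fail on Julia 1.10 until the promotion methods are fixed.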

@NilsNiggemann
Author

I tried to do this, but it still seems somewhat complicated.
To keep it simple, I first checked whether straightforwardly fixing your promotion example also fixes the issue.
The type inference of Base.promote is actually resolved by basically copy-pasting code from Base in src/promotion.jl (But why though?)

"""copy-pasted code from Base._promote"""
@inline function __Base_stolen_promote(x,y,z)
  R = Base.promote_typeof(x, y, z)
  return (convert(R, x), convert(R, y), convert(R, z))
end

"""copy-pasted code from Base.promote, inserted __Base_stolen_promote"""
@inline function Base.promote(
  x::VecUnroll,
  y::VecUnroll,
  z::VecUnroll
)
  # px, py, pz = Base._promote(x, y, z) # this is type unstable
  px, py, pz = __Base_stolen_promote(x, y, z) # this infers correctly

  Base.not_sametype((x,y,z), (px,py,pz))
  px, py, pz
end

If I run your example from above

promotion
using VectorizationBase

vxi = Vec(ntuple(identity, 4)...);

vxf = Vec(ntuple(float, 4)...);

vxiu = VecUnroll((vxi,vxi,vxi,vxi));

vxfu = VecUnroll((vxf,vxf,vxf,vxf));

@code_warntype VectorizationBase.promote(vxiu,vxfu,vxfu)

I see that this does indeed remove the instability. (Of course this would not be the PR, since I am using unexported Base functions, but for now it's worth trying.)

However, my function LVTest() still does not infer correctly:

code_warntype output

MethodInstance for LVTest(::Vector{Float64}, ::Vector{Int64})
  from LVTest(a1, a2) @ Main /storage/niggeni/.julia_hexagon/dev/VectorizationBase/promotiontest/promotion.jl:31
Arguments
  #self#::Core.Const(LVTest)
  a1::Vector{Float64}
  a2::Vector{Int64}
Locals
  @_4::Union{}
  val@_5::Union{}
  @_6::Int64
  @_7::Int64
  val@_8::Base.OneTo{Int64}
  res::Any
  res_##onevec##::Any
  Wvecwidth##::StaticInt{4}
  Tloopeltype##::Type{Float64}
  vargsym#332::Tuple{Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}}
  ##grouped#strided#pointer####7###::LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}
  preserve#buffer#@_15::Vector{Int64}
  vptr##_a2::LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}
  preserve#buffer#@_17::Vector{Float64}
  vptr##_a1::LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}
  #i_loop_upper_bound###4###::Int64
  #i_loop_lower_bound###3###::StaticInt{1}
  #loopleni###2###::Int64
  #looprangei###1###::Static.OptionallyStaticUnitRange{StaticInt{1}, Int64}
  msg::Union{}
  kwargs::Union{}
  line::Union{}
  file::Union{}
  id::Union{}
  logger::Union{}
  _module::Union{}
  group::Union{}
  std_level::Union{}
  level::Union{}
  err::Union{}
  i::Union{}
  @_35::Union{}
Body::Any
1 ─        Core.NewvarNode(:(@_4))
│          Core.NewvarNode(:(val@_5))
│          Core.NewvarNode(:(@_6))
│          Core.NewvarNode(:(@_7))
│          Core.NewvarNode(:(res_##onevec##))
│          Core.NewvarNode(:(Wvecwidth##))
│          Core.NewvarNode(:(Tloopeltype##))
│          Core.NewvarNode(:(vargsym#332))
│          Core.NewvarNode(:(##grouped#strided#pointer####7###))
│          Core.NewvarNode(:(preserve#buffer#@_15))
│          Core.NewvarNode(:(vptr##_a2))
│          Core.NewvarNode(:(preserve#buffer#@_17))
│          Core.NewvarNode(:(vptr##_a1))
│   %14  = Main.eltype(a1)::Core.Const(Float64)
│          (res = Main.zero(%14))
│          Main.nothing
│          Main.nothing
│          Main.nothing
│          nothing
│          (val@_8 = Main.eachindex(a1, a2))
│          nothing
│   %22  = val@_8::Base.OneTo{Int64}
│          (#looprangei###1### = LoopVectorization.canonicalize_range(%22))
│          (#loopleni###2### = StaticArrayInterface.static_length(#looprangei###1###))
│          (#i_loop_lower_bound###3### = LoopVectorization.maybestaticfirst(#looprangei###1###))
│          (#i_loop_upper_bound###4### = LoopVectorization.maybestaticlast(#looprangei###1###))
│   %27  = Main.typeof(res::Core.Const(0.0))::Core.Const(Float64)
│   %28  = LoopVectorization.check_args(a1, a2, %27)::Core.Const(true)
└──        goto #5 if not %28
2 ─        goto #5 if not true
3 ─ %31  = (LoopVectorization.can_turbo)(LoopVectorization.vfmadd_fast, Val{3}())::Core.Const(true)
└──        goto #5 if not %31
4 ─ %33  = LoopVectorization.stridedpointer_preserve(a1)::Tuple{LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}, Vector{Float64}}
│   %34  = Base.indexed_iterate(%33, 1)::Core.PartialStruct(Tuple{LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}, Int64}, Any[LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}, Core.Const(2)])
│          (vptr##_a1 = Core.getfield(%34, 1))
│          (@_7 = Core.getfield(%34, 2))
│   %37  = Base.indexed_iterate(%33, 2, @_7::Core.Const(2))::Core.PartialStruct(Tuple{Vector{Float64}, Int64}, Any[Vector{Float64}, Core.Const(3)])
│          (preserve#buffer#@_17 = Core.getfield(%37, 1))
│   %39  = LoopVectorization.stridedpointer_preserve(a2)::Tuple{LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}, Vector{Int64}}
│   %40  = Base.indexed_iterate(%39, 1)::Core.PartialStruct(Tuple{LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}, Int64}, Any[LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}, Core.Const(2)])
│          (vptr##_a2 = Core.getfield(%40, 1))
│          (@_6 = Core.getfield(%40, 2))
│   %43  = Base.indexed_iterate(%39, 2, @_6::Core.Const(2))::Core.PartialStruct(Tuple{Vector{Int64}, Int64}, Any[Vector{Int64}, Core.Const(3)])
│          (preserve#buffer#@_15 = Core.getfield(%43, 1))
│   %45  = vptr##_a1::LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}
│   %46  = Core.tuple(#i_loop_lower_bound###3###)::Core.Const((static(1),))
│   %47  = LoopVectorization.gespf1(%45, %46)::LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}}
│   %48  = LoopVectorization.densewrapper(%47, a1)::LayoutPointers.DensePointerWrapper{(true,), Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}, LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}}}
│   %49  = vptr##_a2::LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{1}}}
│   %50  = Core.tuple(#i_loop_lower_bound###3###)::Core.Const((static(1),))
│   %51  = LoopVectorization.gespf1(%49, %50)::LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}}
│   %52  = LoopVectorization.densewrapper(%51, a2)::LayoutPointers.DensePointerWrapper{(true,), Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}, LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}}}
│   %53  = Core.tuple(%48, %52)::Tuple{LayoutPointers.DensePointerWrapper{(true,), Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}, LayoutPointers.StridedPointer{Float64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}}}, LayoutPointers.DensePointerWrapper{(true,), Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}, LayoutPointers.StridedPointer{Int64, 1, 1, 0, (1,), Tuple{StaticInt{8}}, Tuple{StaticInt{0}}}}}
│   %54  = Main.Val::Core.Const(Val)
│   %55  = ()::Core.Const(())
│   %56  = Core.apply_type(%54, %55)::Core.Const(Val{()})
│   %57  = (%56)()::Core.Const(Val{()}())
│   %58  = LoopVectorization.grouped_strided_pointer(%53, %57)::Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, Tuple{Nothing, Nothing}}
│          (##grouped#strided#pointer####7### = (getfield)(%58, 1))
│   %60  = $(Expr(:gc_preserve_begin, :(preserve#buffer#@_17), :(preserve#buffer#@_15)))
│   %61  = LoopVectorization.zerorangestart(#looprangei###1###)::CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}
│   %62  = Core.tuple(%61)::Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}
│   %63  = ##grouped#strided#pointer####7###::LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}
│   %64  = Base.eltype(res::Core.Const(0.0))::Core.Const(Float64)
│   %65  = Core.tuple(%63, %64)::Core.PartialStruct(Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}, Any[LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, Type{Float64}])
│          (vargsym#332 = Core.tuple(%62, %65))
│   %67  = LoopVectorization.eltype(a1)::Core.Const(Float64)
│   %68  = LoopVectorization.eltype(a2)::Core.Const(Int64)
│   %69  = Base.eltype(res::Core.Const(0.0))::Core.Const(Float64)
│          (Tloopeltype## = LoopVectorization.promote_type(%67, %68, %69))
│          (Wvecwidth## = LoopVectorization.pick_vector_width(Tloopeltype##::Core.Const(Float64)))
│   %72  = LoopVectorization._turbo_!::Core.Const(LoopVectorization._turbo_!)
│   %73  = Core.apply_type(Main.Val, (false, 0, 0, 0, false, 0x0000000000000001, 1, true))::Core.Const(Val{(false, 0, 0, 0, false, 0x0000000000000001, 1, true)})
│   %74  = (%73)()::Core.Const(Val{(false, 0, 0, 0, false, 0x0000000000000001, 1, true)}())
│   %75  = LoopVectorization.avx_config_val(%74, Wvecwidth##)::Core.Const(Val{(false, 0, 0, 0, false, 4, 32, 15, 64, 0x0000000000000001, 1, true)}())
│   %76  = Main.Val::Core.Const(Val)
│   %77  = Core.tuple(:LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0001, 0x0001), :LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0002, 0x0002), Symbol("##DROPPED#CONSTANT##"), Symbol("##DROPPED#CONSTANT##"), LoopVectorization.OperationStruct(0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.constant, 0x0003, 0x0000), :LoopVectorization, :vfmadd_fast, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000100020003, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.compute, 0x0003, 0x0000))::Core.Const((:LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0001, 0x0001), :LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 
0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0002, 0x0002), Symbol("##DROPPED#CONSTANT##"), Symbol("##DROPPED#CONSTANT##"), LoopVectorization.OperationStruct(0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.constant, 0x0003, 0x0000), :LoopVectorization, :vfmadd_fast, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000100020003, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.compute, 0x0003, 0x0000)))
│   %78  = Core.apply_type(%76, %77)::Core.Const(Val{(:LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0001, 0x0001), :LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0002, 0x0002), Symbol("##DROPPED#CONSTANT##"), Symbol("##DROPPED#CONSTANT##"), LoopVectorization.OperationStruct(0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.constant, 0x0003, 0x0000), :LoopVectorization, :vfmadd_fast, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000100020003, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.compute, 0x0003, 0x0000))})
│   %79  = (%78)()::Core.Const(Val{(:LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0001, 0x0001), :LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0002, 0x0002), Symbol("##DROPPED#CONSTANT##"), Symbol("##DROPPED#CONSTANT##"), LoopVectorization.OperationStruct(0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.constant, 0x0003, 0x0000), :LoopVectorization, :vfmadd_fast, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000100020003, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.compute, 0x0003, 0x0000))}())
│   %80  = Main.Val::Core.Const(Val)
│   %81  = Core.tuple(LoopVectorization.ArrayRefStruct{:a1, Symbol("##vptr##_a1")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001), LoopVectorization.ArrayRefStruct{:a2, Symbol("##vptr##_a2")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001))::Core.Const((LoopVectorization.ArrayRefStruct{:a1, Symbol("##vptr##_a1")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001), LoopVectorization.ArrayRefStruct{:a2, Symbol("##vptr##_a2")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001)))
│   %82  = Core.apply_type(%80, %81)::Core.Const(Val{(LoopVectorization.ArrayRefStruct{:a1, Symbol("##vptr##_a1")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001), LoopVectorization.ArrayRefStruct{:a2, Symbol("##vptr##_a2")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001))})
│   %83  = (%82)()::Core.Const(Val{(LoopVectorization.ArrayRefStruct{:a1, Symbol("##vptr##_a1")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001), LoopVectorization.ArrayRefStruct{:a2, Symbol("##vptr##_a2")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001))}())
│   %84  = Main.Val::Core.Const(Val)
│   %85  = Core.tuple(4)::Core.Const((4,))
│   %86  = Core.tuple(3)::Core.Const((3,))
│   %87  = ()::Core.Const(())
│   %88  = ()::Core.Const(())
│   %89  = ()::Core.Const(())
│   %90  = ()::Core.Const(())
│   %91  = Core.tuple(0, %85, %86, %87, %88, %89, %90)::Core.Const((0, (4,), (3,), (), (), (), ()))
│   %92  = Core.apply_type(%84, %91)::Core.Const(Val{(0, (4,), (3,), (), (), (), ())})
│   %93  = (%92)()::Core.Const(Val{(0, (4,), (3,), (), (), (), ())}())
│   %94  = Main.Val::Core.Const(Val)
│   %95  = (:i,)::Core.Const((:i,))
│   %96  = Core.apply_type(%94, %95)::Core.Const(Val{(:i,)})
│   %97  = (%96)()::Core.Const(Val{(:i,)}())
│   %98  = Base.typeof(vargsym#332::Core.PartialStruct(Tuple{Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}}, Any[Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Core.PartialStruct(Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}, Any[LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, Type{Float64}])]))::Core.Const(Tuple{Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}})
│   %99  = Base.Val(%98)::Core.Const(Val{Tuple{Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}}}())
│   %100 = Core.tuple(%75, %79, %83, %93, %97, %99)::Core.Const((Val{(false, 0, 0, 0, false, 4, 32, 15, 64, 0x0000000000000001, 1, true)}(), Val{(:LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0001, 0x0001), :LoopVectorization, :getindex, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.memload, 0x0002, 0x0002), Symbol("##DROPPED#CONSTANT##"), Symbol("##DROPPED#CONSTANT##"), LoopVectorization.OperationStruct(0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.constant, 0x0003, 0x0000), :LoopVectorization, :vfmadd_fast, LoopVectorization.OperationStruct(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000100020003, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, 0x00000000000000000000000000000000, LoopVectorization.compute, 0x0003, 0x0000))}(), Val{(LoopVectorization.ArrayRefStruct{:a1, Symbol("##vptr##_a1")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 0x00000000000000000000000000000000, 0x00000000000000000000000000000001), LoopVectorization.ArrayRefStruct{:a2, Symbol("##vptr##_a2")}(0x00000000000000000000000000000001, 0x00000000000000000000000000000001, 
0x00000000000000000000000000000000, 0x00000000000000000000000000000001))}(), Val{(0, (4,), (3,), (), (), (), ())}(), Val{(:i,)}(), Val{Tuple{Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}}}()))
│   %101 = LoopVectorization.flatten_to_tuple(vargsym#332::Core.PartialStruct(Tuple{Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}}, Any[Tuple{CloseOpenIntervals.CloseOpen{StaticInt{0}, Int64}}, Core.PartialStruct(Tuple{LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, DataType}, Any[LayoutPointers.GroupedStridedPointers{Tuple{Ptr{Float64}, Ptr{Int64}}, (1, 1), (0, 0), ((1,), (1,)), ((1,), (2,)), Tuple{StaticInt{8}, StaticInt{8}}, Tuple{StaticInt{0}, StaticInt{0}}}, Type{Float64}])]))::Tuple{Int64, Ptr{Float64}, Ptr{Int64}, LoopVectorization.StaticType{Float64}}
│   %102 = Core._apply_iterate(Base.iterate, %72, %100, %101)::Any
│          (res_##onevec## = %102)
│          $(Expr(:gc_preserve_end, :(%60)))
│   %105 = LoopVectorization.vecmemaybe(res_##onevec##)::Any
│          (res = LoopVectorization.reduced_add(%105, res::Core.Const(0.0)))
└──        goto #6
5 ┄        Core.Const(Core.NewvarNode(:(msg)))
│          Core.Const(Core.NewvarNode(:(kwargs)))
│          Core.Const(Core.NewvarNode(:(line)))
│          Core.Const(Core.NewvarNode(:(file)))
│          Core.Const(Core.NewvarNode(:(id)))
│          Core.Const(Core.NewvarNode(:(logger)))
│          Core.Const(Core.NewvarNode(:(_module)))
│          Core.Const(Core.NewvarNode(:(group)))
│          Core.Const(:(level = Base.CoreLogging.Warn))
│          Core.Const(:(std_level = level))
│          Core.Const(:(std_level))
│          Core.Const(:(Base.getindex(Base.RefValue{Base.CoreLogging.LogLevel}(Debug))))
│          Core.Const(:(%118 >= %119))
│          Core.Const(:(goto %159 if not %120))
│          Core.Const(:(group = :condense_loopset))
│          Core.Const(:(_module = Main))
│          Core.Const(:(logger = (Base.CoreLogging.current_logger_for_env)(std_level, group, _module)))
│          Core.Const(:(logger === Base.CoreLogging.nothing))
│          Core.Const(:(!%125))
│          Core.Const(:(goto %159 if not %126))
│          Core.Const(:(id = :Main_8a836deb))
│          Core.Const(:(Base.CoreLogging.invokelatest(Base.CoreLogging.shouldlog, logger, level, _module, group, id)))
│          Core.Const(:(goto %159 if not %129))
│          Core.Const(:(file = "/storage/niggeni/.julia_hexagon/packages/LoopVectorization/7gWfp/src/condense_loopset.jl"))
│          Core.Const(:(file isa Base.CoreLogging.String))
│          Core.Const(:(goto %136 if not %132))
│          Core.Const(:(Base.fixup_stdlib_path))
│          Core.Const(:(file = (%134)(file)))
│          Core.Const(:(line = 1148))
│          Core.Const(nothing)
│          Core.Const(:(err = %137))
│          Core.Const(:(err === Base.CoreLogging.nothing))
│          Core.Const(:(goto %148 if not %139))
│          Core.Const(:(msg = "#= /storage/niggeni/.julia_hexagon/dev/VectorizationBase/promotiontest/promotion.jl:33 =#:\n`LoopVectorization.check_args` on your inputs failed; running fallback `@inbounds @fastmath` loop instead.\nUse `warn_check_args=false`, e.g. `@turbo warn_check_args=false ...`, to disable this warning."))
│          Core.Const((:maxlog,))
│          Core.Const(:(Core.apply_type(Core.NamedTuple, %142)))
│          Core.Const(:(Core.tuple(1)))
│          Core.Const(:(kwargs = (%143)(%144)))
│          Core.Const(:(@_35 = true))
│          Core.Const(:(goto %150))
│          Core.Const(:(Base.invokelatest(Base.CoreLogging.logging_error, logger, level, _module, group, id, file, line, err, false)))
│          Core.Const(:(@_35 = false))
│          Core.Const(:(goto %159 if not @_35))
│          Core.Const(:(Base.NamedTuple()))
│          Core.Const(:(Base.merge(%151, kwargs)))
│          Core.Const(:(Base.isempty(%152)))
│          Core.Const(:(goto %157 if not %153))
│          Core.Const(:(Base.CoreLogging.invokelatest(Base.CoreLogging.handle_message, logger, level, msg, _module, group, id, file, line)))
│          Core.Const(:(goto %158))
│          Core.Const(:(Core.kwcall(%152, Base.CoreLogging.invokelatest, Base.CoreLogging.handle_message, logger, level, msg, _module, group, id, file, line)))
│          Core.Const(:(goto %159))
│          Core.Const(:(Base.CoreLogging.nothing))
│          Core.Const(nothing)
│          Core.Const(:(Main.eachindex(a1, a2)))
│          Core.Const(:(@_4 = Base.iterate(%161)))
│          Core.Const(:(@_4 === nothing))
│          Core.Const(:(Base.not_int(%163)))
│          Core.Const(:(goto %183 if not %164))
│          Core.Const(:(@_4))
│          Core.Const(:(i = Core.getfield(%166, 1)))
│          Core.Const(:(Core.getfield(%166, 2)))
│          Core.Const(:(Base.FastMath))
│          Core.Const(:(Base.getproperty(%169, :add_fast)))
│          Core.Const(:(res))
│          Core.Const(:(Base.FastMath))
│          Core.Const(:(Base.getproperty(%172, :mul_fast)))
│          Core.Const(:(Base.getindex(a1, i)))
│          Core.Const(:(Base.getindex(a2, i)))
│          Core.Const(:((%173)(%174, %175)))
│          Core.Const(:(res = (%170)(%171, %176)))
│          Core.Const(:(@_4 = Base.iterate(%161, %168)))
│          Core.Const(:(@_4 === nothing))
│          Core.Const(:(Base.not_int(%179)))
│          Core.Const(:(goto %183 if not %180))
│          Core.Const(:(goto %166))
│          Core.Const(:(val@_5 = nothing))
│          Core.Const(nothing)
└──        Core.Const(:(val@_5))
6 ┄        return res

Any ideas on how to make progress?
Another weird thing is that I found VectorizationBase.promote(vxiu,vxfu,vxfu) to throw an error (from Base.not_sametype) on one of my other machines.
Should I open a separate issue on this?
