Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When working on types, inference rather greedily widens #36713

Closed
bramtayl opened this issue Jul 17, 2020 · 13 comments
Closed

When working on types, inference rather greedily widens #36713

bramtayl opened this issue Jul 17, 2020 · 13 comments
Labels
compiler:inference Type inference missing data Base.missing and related functionality

Comments

@bramtayl
Copy link
Contributor

bramtayl commented Jul 17, 2020

Running this code:

function test()
    data = (a = if rand(Bool)
        1
    else
        "1"
    end,)
    merge(data, (b = data.a,))
end

The returned value from inference is NamedTuple{(:a, :b),_A} where _A<:Tuple.
I would expect it to be NamedTuple{(:a, :b), T} where T <: Tuple{Union{Int, String}, Union{Int, String}}.
This would be useful for processing rows of a table.

@KristofferC KristofferC added compiler:inference Type inference missing data Base.missing and related functionality labels Jul 17, 2020
@pdeffebach
Copy link
Contributor

Is #35504 relevant here?

@bramtayl
Copy link
Contributor Author

bramtayl commented Aug 5, 2020

I think this one is different but I'm not sure

@nalimilan
Copy link
Member

master seems to infer the type of variables better, so it may not be that far from inferring the return type too:
Julia 1.5.0:

julia> @code_warntype test()
Variables
  #self#::Core.Compiler.Const(test, false)
  data::NamedTuple{(:a,),_A} where _A<:Tuple
  @_3::Union{Int64, String}

Body::NamedTuple{(:a, :b),_A} where _A<:Tuple

master:

julia> @code_warntype test()
Variables
  #self#::Core.Const(test, false)
  data::NamedTuple{(:a,),_A} where _A<:Tuple{Union{Int64, String}}
  @_3::Union{Int64, String}

Body::NamedTuple{(:a, :b),_A} where _A<:Tuple

@martinholters
Copy link
Member

The problem is this function:

julia/base/namedtuple.jl

Lines 204 to 208 in 7c0cb30

@pure function merge_types(names::Tuple{Vararg{Symbol}}, a::Type{<:NamedTuple}, b::Type{<:NamedTuple})
@nospecialize names a b
bn = _nt_names(b)
return Tuple{Any[ fieldtype(sym_in(names[n], bn) ? b : a, names[n]) for n in 1:length(names) ]...}
end

With sufficient knowledge about the parameters, it can be run at compile time (thanks to the @pure). But otherwise, it can obviously at best be inferred as Type{<:Tuple}. (in fact, it's only inferred as Any, but that doesn't matter much.)

@bramtayl
Copy link
Contributor Author

bramtayl commented Aug 7, 2020

I'm not sure I understand...seems like Julia should have all the information it needs?

@martinholters
Copy link
Member

Ah, sorry, that might ave been a bit terse. Note that merge_types is invoked with the types, as the name suggests. Those are only inferred as NamedTuple{(:a,),_A} where _A<:Tuple{Union{Int64, String}}, so it is unknown on which type the function will be actually run (could be NamedTuple{(:a,),Tuple{Int64}} or NamedTuple{(:a,),Tuple{String}}, or from only this information even NamedTuple{(:a,),Tuple{Union{}}}. So how should we run it at compile time? In general, I think a @pure function can only be evaluated at compile-time if its arguments could either be inferred as Consts or as singleton types, so that their values are known at compile-time. How else to invoke it?

Note that if the types were NamedTuple{(:a,), Tuple{Union{Int, String}}} that would be an entirely different story.

@bramtayl
Copy link
Contributor Author

bramtayl commented Aug 7, 2020

Oh I see, got it. Still seems like NamedTuple{(:a,),_A} where _A<:Tuple{Union{Int64, String}} should be enough to figure out some information about both the field names :a and the field types Union{Int64, String}?

@bramtayl
Copy link
Contributor Author

bramtayl commented Aug 8, 2020

I guess, one would hope that anything infer-able for Tuples would also be infer-able for named tuples, and it infers fine with a tuple:

function test()
  data = (if rand(Bool)
      1
  else
      "1"
  end,)
  (data..., data...)
end
@code_warntype test()

@bramtayl
Copy link
Contributor Author

bramtayl commented Aug 9, 2020

Am I correct in thinking that the real problem is that generated and pure functions can only be used by the compiler ahead of time if arguments are leaf-type, even if sufficient type information would be available otherwise? I could change the title if so.

@bramtayl bramtayl changed the title Imprecise inference when merging a union value into a NamedTuple of unions Use pure functions on union types if possible Aug 9, 2020
@martinholters
Copy link
Member

martinholters commented Aug 10, 2020

While that would be helpful in this specific case, one would also hope that e.g. merge(::NamedTuple{(:a,),<:Integer}, ::NamedTuple{(:b,),<:Integer}) could be inferred as NamedTuple{(:a, :b), <:Tuple{Integer,Integer}} (instead of NamedTuple{(:a, :b), <:Tuple like now). For that, it would help if merge_types could be inferred better even if it is not evaluated during compilation.

In a first step, that would mean replacing the Any[...] comprehension with something that does not loose as much information. E.g. replacing that line in merge_types with

    return Tuple{ntuple(n -> fieldtype(sym_in(names[n], bn) ? b : a, names[n]),  Val(length(names)))...} 

results in test() being inferred as NamedTuple{(:a, :b),_A} where _A<:Tuple{Any,Vararg{Any}}. The main culprit preventing a more precise result is that when working on types, inference rather greedily widens eveyrthing to just Type and Tuple{T1,T2} where T1 and T2 are known to be Types is inferred as Tuple{Any,Vararg{Any}} because T2 could be Vararg.

@bramtayl
Copy link
Contributor Author

Well, you've lost me again but I believe you

@bramtayl bramtayl changed the title Use pure functions on union types if possible When working on types, inference rather greedily widens Aug 10, 2020
@bramtayl
Copy link
Contributor Author

I watched the JuliaCon talk where @StefanKarpinski mentioned prioritizing making Julia faster for tabular data and I think solving this would be an important step

@vtjnash vtjnash closed this as completed Aug 7, 2024
@vtjnash
Copy link
Member

vtjnash commented Aug 7, 2024

Inference now is fine with this, and generates the optimal types

│  │ %39 = Core.typeof(%38)::Type{<:Tuple{Union{Int64, String}}}
│  │ %40 = Core.apply_type(Core.NamedTuple, (:b,), %39)::Type{NamedTuple{(:b,), var"#s183"}} where var"#s183"<:Tuple{Union{Int64, String}}
│  │ %41 = %new(%40, %33)::NamedTuple{(:b,), <:Tuple{Union{Int64, String}}}

Note that because NamedTuple uses a lot of @generated functions, the implementation of NamedTuple then discards the information by inference, since it is not possible to infer through that. So the final result is not better, but that is not an inference issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:inference Type inference missing data Base.missing and related functionality
Projects
None yet
Development

No branches or pull requests

6 participants