Create a function to use type inference for eltype #54157
Comments
As far as I remember, […]. Apart from that, the run-time cost of type inference wouldn't be acceptable for […] |
Consider, though, that […]. Also, I see very good performance using it on some micro-benchmarks:
julia> using BenchmarkTools
julia> iter = (x for x in 1:10)
Base.Generator{UnitRange{Int64}, typeof(identity)}(identity, 1:10)
julia> @benchmark eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 6.061 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.192 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.197 ns ± 0.056 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂ █▆ ▄▃ ▁ ▃▂ ▁ ▁
█▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁██▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁█ █
1.18 ns Histogram: log(frequency) by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark Base.@default_eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 6.843 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.192 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.202 ns ± 0.085 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▃▁▁▁▁▁▁▁▁█▆▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▃▂▁▁▁▁▁▁▁▄▃▁▁▁▁▁▁▁▁▃ ▂
1.18 ns Histogram: frequency by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> iter = (x < 5 ? x : 2.0*x for x in 1:10)
Base.Generator{UnitRange{Int64}, var"#1#2"}(var"#1#2"(), 1:10)
julia> @benchmark eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 22.753 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.192 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.203 ns ± 0.218 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▃▁▁▁▁▁▁▁▁█▆▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▄▃▁▁▁▁▁▁▁▁▃ ▂
1.18 ns Histogram: frequency by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark Base.@default_eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 6.622 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.193 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.209 ns ± 0.073 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▂▁▁▁▁▁▁▁▁█▆▁▁▁▁▁▁▁▃▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▃▂▁▁▁▁▁▁▁▇▄▁▁▁▁▁▁▁▁▅ ▂
1.18 ns Histogram: frequency by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
|
xref #54207 |
Indeed, it would be really okay to have something like
infer_eltype(itr) = (T = eltype(itr)) === Any ? @default_eltype(itr) : T
|
Another possibility is to create a new function as above and export it, or even something maybe a bit stronger such as
infer_eltype(itr) = ifelse((T2 = Base.@default_eltype(itr)) <: (T1 = eltype(itr)), T2, T1)
with all the needed disclaimers in the docstring |
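For concreteness, here is a sketch of how the first of these one-liners would behave; infer_eltype is the hypothetical new function defined just above, not anything that exists in Base:
julia> infer_eltype(itr) = (T = eltype(itr)) === Any ? Base.@default_eltype(itr) : T;

julia> infer_eltype(x for x in 1:10)   # eltype gives Any for a generator, so the inferred Int64 wins
Int64

julia> infer_eltype([1, 2, 3])         # eltype is already informative, so inference is never consulted
Int64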
We might want to special case this:
julia> Base.@default_eltype Iterators.Filter([1,2,3]) do x
sum([1]) == 0
end
Int64
julia> Base.@default_eltype Iterators.Filter([1,2,3]) do x
sum(1) == 0
end
Union{} |
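One way that special case could look is sketched below; eltype_or_inferred is a hypothetical helper, not anything in Base, and it simply falls back to the declared eltype whenever inference proves the iterator can't yield anything, much like the infer_eltype variants discussed elsewhere in the thread:
# hypothetical helper: prefer the inferred element type, but never return the
# uninhabited Union{} that inference produces for provably empty iterators
function eltype_or_inferred(itr)
    T = Base.@default_eltype(itr)
    return T === Union{} ? eltype(itr) : T
end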
A thing I don't understand: given
julia> s = (x < 5 ? x : 2.0*x for x in 1:10)
Base.Generator{UnitRange{Int64}, var"#5#6"}(var"#5#6"(), 1:10)
julia> T = Core.Compiler.return_type(Base._iterator_upper_bound, Tuple{typeof(s)})
Union{Float64, Int64}
julia> Base.promote_typejoin_union(T)
Real
why can't we use that Union? For instance:
julia> collect(s)
10-element Vector{Real}:
... |
collect does not return a Union, as it gets the types from the real values and not from inference |
Right, now I see: it widens the type. Nonetheless, I guess with an […] |
@vtjnash this is a bit derailing the conversation, but I'd like to ask you: is it a bad idea to return a |
Yes. That will never be the type of the object when it actually contains data, so it slows down the case where there is data to benefit the case where there is no data |
You typically want the static and dynamic types to nearly coincide, for optimal results |
It would be really unfortunate, especially if this avoidance of […]. Do you suggest turning […]? |
If we are going the route where we export a new function, which to me seems preferable, I (maybe mis)understood what @LilithHafner has suggested to be implemented as, for example,
function infer_eltype(itr)
    T1, T2 = eltype(itr), Base.@default_eltype(itr)
    ifelse(T2 !== Union{} && T2 <: T1, T2, T1)
end |
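A quick sketch of what this variant returns on the two cases that came up earlier in the thread (outputs as reported there; the T2 !== Union{} guard is what makes the Filter case fall back to the underlying eltype, which for Iterators.Filter forwards to the wrapped iterator):
julia> infer_eltype(x < 5 ? x : 2.0*x for x in 1:10)  # inference gives Real here, which is narrower than Any
Real

julia> infer_eltype(Iterators.Filter(x -> sum(1) == 0, [1, 2, 3]))  # inference gives Union{}, so fall back to eltype
Int64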
Inference is tasked with finding the narrowest type it can come up with that is a supertype of all types of instances that may actually occur. Coming up with a supertype of all types that could actually occur is a correctness thing: it has to get that right every time or we have compiler bugs. Picking a narrow type is on a best-effort basis.
We cannot guarantee that in any case, so long as we rely on inference, because inference may decide to provide an arbitrarily wide eltype for A and/or B. Even turning […] |
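A small illustration of that best-effort aspect (a sketch; the widening here comes from the loosely typed source array, even though every element produced at run time is an Int64):
julia> xs = Any[1, 2, 3];

julia> g = (x + 1 for x in xs);     # run-time values are all Int64

julia> Base.@default_eltype(g)      # but inference only sees x::Any, so the inferred eltype is Any
Any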
If we want to return the narrowest type possible, would it then be fine to use (a macro version of) […]? |
This is great – inference is improving!
And this seems to be the actual issue. |
Which raises the question: when would someone use this |
I would say at least whoever is using |
Seems like around 20 packages, some of which are very popular, are using it: https://juliahub.com/ui/Search?q=@default_eltype&type=code |
Yeah, for example changing
function _DisjointSet(xs, ::Base.EltypeUnknown)
    T = Base.@default_eltype(xs)
    (isconcretetype(T) || T === Union{}) || return Base.grow_to!(DisjointSet{T}(), xs)
    return DisjointSet{T}(xs)
end
to
function _DisjointSet(xs, ::Base.EltypeUnknown)
    T = infer_eltype(xs)
    isconcretetype(T) || return Base.grow_to!(DisjointSet{T}(), xs)
    return DisjointSet{T}(xs)
end
has one semantic change: when inference can infer that the iterator is empty you get a […] |
IMO a pull request adding |
I could try to make a PR but I'm still unsure about a thing; consider:
julia> struct A end
julia> struct B end
julia> itr = (x < 5 ? A() : B() for x in 1:10)
Base.Generator{UnitRange{Int64}, var"#1#2"}(var"#1#2"(), 1:10)
julia> Base.@default_eltype(itr)
Any
julia> Core.Compiler.return_type(Base._iterator_upper_bound, Tuple{typeof(itr)})
Union{A, B}
I would really prefer the second one, so I think a possible […] |
ok, maybe we could actually have two versions, with a |
Sure, IIUC that |
It was a recent performance improvement for collect, which is designed not to return random Union types normally (only having trouble with the empty case). The exact PR for it should be findable by git blame if interested in the full picture |
The PR that changed […] |
Yes, looks like the main explanation of that PR is in #30485 though |
Base.@default_eltype for iterators and generators whose eltype currently returns Any?
Tried to make a PR to create |
We don't plan on supporting this for reasons mentioned in #54909 (comment) |
I'm using this function to compute the type generically for iterators:

[…]

and I'm asking myself if this couldn't be applied by default to some generators and iterators whose eltype currently returns Any (actually, if it is applied specifically to just those types, it can only be used where Base.@default_eltype(iter) is used, I think). For example, consider […]

Performance-wise I see a cost of a couple of ns. I found that there is this other issue and associated PR about a special case of this: #48249 and #48277, but I think that a function as above could be applied more generally. Would this be okay? Or is this problematic?
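The function itself didn't survive in this copy of the issue. Judging from the internals referenced elsewhere in the thread (Core.Compiler.return_type applied to Base._iterator_upper_bound, then Base.promote_typejoin_union), it was presumably something along these lines; this is a reconstruction, not the author's original code:
# hypothetical reconstruction of the function described above
infer_eltype(itr) =
    Base.promote_typejoin_union(
        Core.Compiler.return_type(Base._iterator_upper_bound, Tuple{typeof(itr)}))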