add inference lattice element for Tuples #28955

Merged
merged 4 commits into master from jr/tuplelattice on Oct 6, 2018

Conversation

@jrevels (Member) commented Aug 29, 2018

w/ @vtjnash

There are probably still some cases to handle and tests to add here...this is just a dump of what we currently have.

Once this PR is complete, it should allow for more robust constant prop of vargs/tuple elements.
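
As a rough illustration of the sort of pattern this targets (a hypothetical example, not taken from this PR):

# Hypothetical illustration only: a constant routed through a varargs tuple.
first_arg(args...) = args[1]
call_with_constant(x) = first_arg(1, x)

# Inside first_arg, inference may only see a plain Tuple type for args and lose the
# fact that args[1] is the constant 1; tracking per-element constants in the argument
# tuple lets constant propagation continue through varargs calls like this.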

@jrevels (Member Author) commented Aug 29, 2018

@@ -13,32 +13,27 @@ mutable struct InferenceResult
         else
             result = linfo.rettype
         end
-        return new(linfo, EMPTY_VECTOR, result, nothing)
+        return new(linfo, compute_inf_result_argtypes(linfo), result, nothing)
Member Author

Originally, I was going to basically rename get_argtypes(::InferenceResult) to init_argtypes, but then I couldn't find a place where it ever looked like we wanted to construct an InferenceResult without initializing the argtypes. Hence this change, and getting rid of get_argtypes(::InferenceResult) altogether.

Am I correct that, currently anyway, in all cases where we construct an InferenceResult we also want to initialize the argtypes?

@JeffBezanson JeffBezanson changed the title WIP: add type lattice element for Tuples WIP: add inference lattice element for Tuples Aug 30, 2018
@JeffBezanson JeffBezanson added the compiler:inference Type inference label Aug 30, 2018
@JeffBezanson (Member)

I'm looking at this. @jrevels can you provide a test case or two?

@@ -808,7 +803,9 @@ function assemble_inline_todo!(ir::IRCode, linetable::Vector{LineInfoNode}, sv::
             # and if rewrite_apply_exprargs can deal with this form
             ok = true
             for i = 3:length(atypes)
-                typ = widenconst(atypes[i])
+                typ = atypes[i]
+                typ isa PartialTuple && continue
Member

This looks wrong; a PartialTuple can still be vararg I think?

vargs = nothing
if !toplevel && caller_argtypes !== nothing
for i in 1:length(caller_argtypes)
a = maybe_widen_conditional(given_argtypes[i])
Member

Is this meant to be caller_argtypes[i]?

Member Author

Ah, yup! Thanks for that catch.

Keno added a commit that referenced this pull request Sep 18, 2018
Previously these sorts of functions would block constant propagation.
Hopefully #28955 will just fix this, but until then, add a surgical
fix and a test.
Keno added a commit that referenced this pull request Sep 19, 2018
Previously these sorts of functions would block constant propagation.
Hopefully #28955 will just fix this, but until then, add a surgical
fix and a test.
@jrevels (Member Author) commented Sep 19, 2018

I think this test case is at least representative of the issue this PR seeks to resolve (e.g. JuliaLabs/Cassette.jl#71), though it might end up being necessary but insufficient:

julia> a(f, args...) = f(args...)
a (generic function with 1 method)

julia> b(args::Tuple) = a(args...)
b (generic function with 1 method)

julia> c(args...) = b(args)
c (generic function with 1 method)

julia> d(f, x, y) = c(f, Bool, x, y)
d (generic function with 1 method)

julia> f(::Type{Bool}, x, y) = x
f (generic function with 1 method)

julia> f(::DataType, x, y) = y
f (generic function with 2 methods)

# I want this to infer exactly Int64 as the output type
julia> @code_typed optimize=false d(f, 1, 2.0)
CodeInfo(
1 ─ %1 = (Main.c)(f, Main.Bool, x, y)::Union{Float64, Int64}
└──      return %1
) => Union{Float64, Int64}

@jrevels jrevels changed the title WIP: add inference lattice element for Tuples add inference lattice element for Tuples Sep 19, 2018
@jrevels (Member Author) commented Sep 19, 2018

Okay, I added the test case I constructed above, which now passes. I also made some other changes based on @JeffBezanson's feedback (thanks!) that let a bunch of other failing cases pass now. Let's see if CI passes.

In the meantime, I'll see if this fixes the Cassette problem cases that motivated this...

@jrevels (Member Author) commented Sep 19, 2018

Okay, looks like this PR does resolve the OP in JuliaLabs/Cassette.jl#71, but even with this fix Cassette still requires those hacky manual primitives just to force specialization in order to pass its inferrability tests.

The overarching goal is to get rid of those entirely, but at least this is an improvement.

@jrevels (Member Author) commented Sep 20, 2018

@nanosoldier runbenchmarks(ALL, vs = ":master")

@jrevels (Member Author) commented Sep 20, 2018

:( no nanosoldier?

@jrevels (Member Author) commented Sep 20, 2018

@nanosoldier runbenchmarks(ALL, vs = ":master")

...let's try again

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@jrevels (Member Author) commented Sep 21, 2018

Oof. Was definitely hoping to see some better perf improvements here. Instead, it looks like this causes a bunch of regressions...I'll do some debugging tomorrow.

Keno added a commit that referenced this pull request Sep 22, 2018
Previously these sorts of functions would block constant propagation.
Hopefully #28955 will just fix this, but until then, add a surgical
fix and a test.
@jrevels (Member Author) commented Sep 24, 2018

So @vtjnash helped me fix the performance problems from the previous benchmarks - it turns out I was using Core.Typeof where I was supposed to use typeof, which coincidentally caused my test case here to pass but broke a bunch of other stuff. Fixing that fixed the benchmarks we tried, but now my original test case no longer passes.
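
For reference (a standalone REPL illustration, not part of the diff), the two differ exactly when the argument is itself a type, which is the kind of value the test case above passes around:

julia> typeof(Int), Core.Typeof(Int)   # differ when the argument is a type
(DataType, Type{Int64})

julia> typeof(1), Core.Typeof(1)       # agree for ordinary values
(Int64, Int64)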

The most recent commit is an attempt at more correctly handling the better type info we have floating around the cache. I skipped CI on it, though, because I messed something up somewhere and am getting a bootstrap BoundsError during compaction:

LoadError(at "compiler/compiler.jl" line 3: LoadError(at "compiler/bootstrap.jl" line 8: BoundsError(a=(), i=1)))
⋮
isassigned at ./array.jl:205
iterate at ./compiler/ssair/ir.jl:1118
iterate at ./compiler/ssair/ir.jl:1054
foreach at ./abstractarray.jl:1844
compact! at ./compiler/ssair/ir.jl:1254
run_passes at ./compiler/ssair/driver.jl:117
optimize at ./compiler/optimize.jl:160
typeinf at ./compiler/typeinfer.jl:35

isva_result_argtypes[nargs] = tuple_tfunc(result_argtypes[(nargs - 1):end])
return isva_result_argtypes
end
return result_argtypes
Member Author

hmmm replacing this whole method body with a call to matching_cache_argtypes(linfo, nothing) at least allows it to build past the point that I was getting a BoundsError before (though of course probably results in a bunch of cache misses)

    for i in 1:length(given_argtypes)
        a = given_argtypes[i]
        result_argtypes[i] = !(isa(a, PartialTuple) || isa(a, Const)) ? widenconst(a) : a
    end
Member Author

@vtjnash Just wanted to make sure this change is correct; the cache uses maybe_widen_conditional(a) for these argtypes so it seemed like we should match that here

@jrevels (Member Author) commented Oct 5, 2018

lol gg pkg

julia> code_typed(setindex!, (Array{Tuple{Pkg.REPLMode.CommandKind, Array{String, 1}, Union{Nothing, Function}, Tuple{Base.Pair{A, B} where B where A, Function, Array{Base.Pair{Symbol, Any}, 1}}, Array{Tuple{Union{Array{String, 1}, String, Bool, Int}, Pkg.REPLMode.OptionClass, Base.Pair{Symbol, Any}}, 1}, String, Union{Nothing, Markdown.MD}}, 1}, Tuple{Pkg.REPLMode.CommandKind, Array{String, 1}, typeof(Pkg.REPLMode.do_test!), Tuple{Base.Pair{Int64, Float64}, typeof(Pkg.REPLMode.parse_pkg), Array{Any, 1}}, Array{Tuple{String, Pkg.REPLMode.OptionClass, Base.Pair{Symbol, Bool}}, 1}, String, Markdown.MD}, Int64), optimize=false)
1-element Array{Any,1}:
 CodeInfo(
769 1 ─ %1 = (Base.convert)($(Expr(:static_parameter, 1)), x)::PartialTuple(Tuple{CommandKind,Array{String,1},typeof(do_test!),Tuple{Pair{Int64,Float64},typeof(parse_pkg),Array{Pair{Symbol,Any},1}},Array{Tuple{Union{Bool, Int64, Array{String,1}, String},OptionClass,Pair{Symbol,Any}},1},String,MD}, Any[CommandKind, Array{String,1}, Const(do_test!, false), Tuple{Pair{Int64,Float64},typeof(parse_pkg),Array{Pair{Symbol,Any},1}}, Array{Tuple{Union{Bool, Int64, Array{String,1}, String},OptionClass,Pair{Symbol,Any}},1}, String, MD])
    │        (Core.typeassert)(%1, $(Expr(:static_parameter, 1)))::Union{}  │
    │        Const(:((Base.arrayset)($(Expr(:boundscheck)), A, %2, i1)), false)::Union{}
    └──      Const(:(return %3), false)::Union{}                            │
) => Union{}

(load a copy of inference with the above diff into a working build of this branch, then BAM)

@Keno (Member) commented Oct 5, 2018

I am unable to reproduce the bitarray CI failure.

@jrevels (Member Author) commented Oct 5, 2018

With the most recent commit, this PR now fixes JuliaLabs/Cassette.jl#71 again

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@jrevels (Member Author) commented Oct 5, 2018

Hmm...

Check out this example:

julia> x = rand(100, 100);

julia> f(x) = cumsum(x, dims=2)
f (generic function with 1 method)

On master:

julia> @time f(x);
  0.000033 seconds (11 allocations: 78.516 KiB)

Here:

julia> @time f(x);
  0.013461 seconds (149.31 k allocations: 3.267 MiB)

If we look at @code_typed optimize=false on both, they are exactly the same.

If we look at @code_typed on both, there are some extra Expr(:unreachable)s floating around on this branch. Diff: https://gist.github.com/jrevels/d7740244bfd07f70263f3ea9f6cefc46

If we look at @code_llvm on both, they are literally the same AFAICT (modulo some generated names, e.g. julia__accumulate!_249602206 vs julia__accumulate!_94174756 etc). Diff: https://gist.github.com/jrevels/0b37dd1f288ba0d4b1cd6888c38303de

Anybody see what's going on here? I don't...

@Keno (Member) commented Oct 5, 2018

The actual code for the accumulate might be different?

@jrevels (Member Author) commented Oct 5, 2018

D'oh, I need to get my first cup of coffee down before posting GitHub comments. Thanks.

@jrevels (Member Author) commented Oct 5, 2018

Hmm. I see statements like this floating around in the IR on this branch (looking at @code_typed Base._accumulate!(Base.add_sum, x, x, 2, nothing)):

%93  = (isa)(1, Int64)::Bool

Looks like the recent widenconst additions to the isa_tfunc might've been too aggressive? I'll play around with it. EDIT: hmmm, seems to be something else...

EDIT 2: oh look, it's our good buddy mister unreachable invalid invoke

    32 ── %96  = (isa)(1, CartesianIndex{1})::Bool                                                         ││││
    └────        goto #34 if not %96                                                                       ││││
    33 ── %98  = invoke %92(_4::Array{Float64,2}, 1::CartesianIndex{1})::Int64
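
For context (a standalone REPL check, not output from this branch): the guard feeding that branch can only ever be false, which is why the :invoke that follows it is dead code rather than a real call.

julia> isa(1, CartesianIndex{1})
false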

@Keno (Member) commented Oct 5, 2018

@nanosoldier runbenchmarks(ALL, vs = ":master")

@Keno (Member) left a comment

FWIW, LGTM once CI and nanosoldier are ok.

@Keno (Member) commented Oct 5, 2018

Ok, looks like this is basically passing. @jrevels Do you want to squash this down a bit (probably combine most of the PartialTuple stuff into one commit, but maybe leave all the things we found along the way - codegen changes, the PiNode thing, separate and prior in the commit list)?

@jrevels (Member Author) commented Oct 5, 2018

Squashed.

@jrevels (Member Author) commented Oct 5, 2018

Argh. Not sure what went wrong but my rebase somehow messed something up. Time to consult the reflog...

jrevels and others added 4 commits October 5, 2018 17:23
…ment for "partially constant" tuples

Previously, we hacked in an additional `InferenceResult` field to store varargs type information
in order to facilitate better constant propagation through varargs methods. There were many
other places, however, where constants moving in/out of tuples/varargs thwarted constant
propagation.

This commit removes the varargs hack, replacing it with a new inference lattice element
(`PartialTuple`) that represents tuples where some (but not all) of the elements are
constants. This allows us to follow through with constant propagation in more
situations involving tuple construction/destructuring, and also enabled a clean-up
of the `InferenceResult` caching code.
E.g. if we had `PiNode(1, CartesianIndex)`, we would eliminate that
because `1` was a constant. However, leaving this in allows the compiler
to realize that this code is unreachable, as well as guarding codegen
against having to work through invalid IR.
Unfortunately, we cannot always rely on :invokes to have argument
values that match the declared ssa types (e.g. if the :invoke is
dynamically unreachable). Because of that, we cannot assert here,
but must instead emit a runtime trap.
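
To make the first commit message above concrete, here is a simplified sketch of the shape of such a lattice element (field names and the example values are illustrative; see the commit itself for the actual definition used by inference):

# Simplified sketch of a "partially constant" tuple lattice element.
struct PartialTuple
    typ                  # the widened Tuple type of the whole tuple
    fields::Vector{Any}  # one lattice element per slot, e.g. Any[Const(Bool), Int64, Float64]
end

# tuple_tfunc can return such an element when some (but not all) slots of a constructed
# tuple are known constants, so that later destructuring/splatting can recover the
# constant slots instead of falling back to the widened Tuple type.
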
@jrevels (Member Author) commented Oct 5, 2018

Okay, I'm not sure how it happened but:

diff --git a/base/compiler/ssair/ir.jl b/base/compiler/ssair/ir.jl
index 44b14a1777..b341ad7734 100644
--- a/base/compiler/ssair/ir.jl
+++ b/base/compiler/ssair/ir.jl
@@ -903,7 +903,7 @@ function process_node!(compact::IncrementalCompact, result::Vector{Any},
         # performance turns out ok
         stmt = renumber_ssa2!(stmt, ssa_rename, used_ssas, late_fixup, result_idx, do_rename_ssa)::PiNode
         if !isa(stmt.val, AnySSAValue) && !isa(stmt.val, GlobalRef)
-            valtyp = isa(stmt.val, QuoteNode) ? typeof(stmt.val.val) : typeof(stmt.val)
+            valtyp = isa(stmt.val, QuoteNode) ? typeof(stmt.val.value) : typeof(stmt.val)
             if valtyp === stmt.typ
                 ssa_rename[idx] = stmt.val
                 return result_idx

lol. Anyway, should be fixed now...
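
For context on that one-character fix (a standalone REPL illustration): QuoteNode stores its wrapped value in a field named value, so the earlier .val access was reading a field that doesn't exist.

julia> q = QuoteNode(:foo)
:(:foo)

julia> fieldnames(QuoteNode)
(:value,)

julia> q.value
:foo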

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@Keno (Member) commented Oct 6, 2018

Looks good to me. @jrevels wanna do the honors?

@jrevels jrevels merged commit 0a685d3 into master Oct 6, 2018
@jrevels jrevels deleted the jr/tuplelattice branch October 6, 2018 03:10
@jrevels (Member Author) commented Oct 6, 2018

Thanks again to @vtjnash @Keno and @JeffBezanson for your help here!

jrevels pushed a commit that referenced this pull request Dec 31, 2018
Previously these sorts of functions would block constant propagation.
Hopefully #28955 will just fix this, but until then, add a surgical
fix and a test.
Labels
compiler:inference Type inference