
internals: better representation for code #31191

Merged 1 commit into master on Mar 29, 2019
Conversation

@vtjnash (Member) commented Feb 27, 2019

I had thought it might be nice to combine Lambda into MethodInstance, but that turned out to be really awkward, since they need to be constructed at different times. We want MethodInstance to be fairly unique on the specialization tuple (since it's our key for looking up various properties), but we may also need to store several copies of the code properties, specialized at various times along assorted dimensions of different inferred properties and world-applicability bounds. That meant we were forced to mutate things at various points in time, and it was really easy for things to become inconsistent. I think it'll be conceptually simpler if each type does a bit less. I even feel much better about describing this in the dev docs (still to do).

This'll help solve many of the problems with edges getting
mis-represented and broken by adding an extra level of indirection between
MethodInstance (now really just representing a particular specialization of a
method) and the executable object, now called Lambda (representing some
functional operator that converts some input arguments to some output
values, with whatever metadata is convenient to contain there).
This fixes many of the previous representation problems with back-edges,
since a MethodInstance (like Method) no longer tries to also represent a
computation. That task is now relegated strictly to Lambda.

@vtjnash added the labels "needs docs" (Documentation for this change is required) and "needs nanosoldier run" (This PR should have benchmarks run on it) on Feb 27, 2019
@JeffBezanson (Member) commented:
I think Lambda would benefit from a more specific name, indicating that it's a pretty low-level thing. Maybe NativeFunction, Callable, NativeCode, ... ?

Review thread on base/compiler/typeinfer.jl (outdated, resolved)
@vtjnash (Member, Author) commented Feb 28, 2019

I think NativeCode could be good. Or CodeInstance? (I think there's some chance in the future that it could merge with CodeInfo.)

@vchuravy (Member) commented:
I like CodeInstance

@vtjnash removed the "needs docs" label on Mar 1, 2019
@vtjnash (Member, Author) commented Mar 1, 2019

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier (Collaborator) commented:
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@vtjnash (Member, Author) commented Mar 4, 2019

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier (Collaborator) commented:
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@vtjnash removed the "needs nanosoldier run" label on Mar 5, 2019
@vtjnash (Member, Author) commented Mar 5, 2019

Alright, I admit I cheated a little bit to make the nanosoldier results look pretty nice. Anyway, I'm pretty happy with this chunk of work now. There are still many improvements that can be made (such as correctness of invoke, correctness of codegen, tracking forward edges, and using more caches for performance), but it should be enough to take care of known existing issues such as #29425, #29267, #29498, and #28595, and to set us up for further refinements in the future.

Review thread on base/compiler/typeinfer.jl (outdated, resolved), on these lines:
    print(iob, " (method too new to be called from this world context.)")
elseif ex.world > max_world(method)
    print(iob, " (method deleted before this world age.)")
Member:

I don't quite understand what changed to make this obsolete.

Member Author:

Methods don't have world-age bounds. They were never supposed to, and it doesn't make any logical sense for them to claim to have them.

Member:

Then what does the Method.max_world field mean?

Member Author:

It doesn't mean anything; that's why it's not correct to have, and it is now removed.

Member:

Does that mean that the world keyword argument to hasmethod should be deprecated (in 2.0)?

Member Author:

No. While Methods don't have any particularly meaningful world age, the lookup algorithm for a method does.

Review threads on src/builtins.c and src/julia.h (outdated, resolved)
{
    JL_TIMING(INFERENCE);
    if (jl_typeinf_func == NULL)
        return NULL;
    if (jl_is_method(mi->def.method) && mi->def.method->unspecialized == mi)
Member commented (on the code above):

If nospecialize is set on all arguments, will we still get a MethodInstance here to infer? It is useful to at least resolve things like loops and literals.

Member Author:

That's a nice feature that this work will allow us to add. Since we don't have it yet, I'm not concerned about it right now.

Review threads on src/gf.c and src/dump.c (resolved)
@Keno (Member) commented Mar 12, 2019

I see about a 10% performance regression in building the system image. Is that expected or am I doing something wrong?

@vtjnash (Member, Author) commented Mar 12, 2019

That might be about right, as some of the now-correct computations are much more expensive than what we had been doing. It would be helpful to know where the hotspots are, if you can gather some profiling data.

@vtjnash (Member, Author) commented Mar 12, 2019

I sometimes see ~10% normal variance (even NFC PRs like #31306 seem to show several percent variance), but this does seem like it may be a consistent regression. The PR:

Base  ─────────── 26.874227 seconds
Base64  ─────────  3.679461 seconds
CRC32c  ─────────  0.008079 seconds
SHA  ────────────  0.170542 seconds
FileWatching  ───  0.104288 seconds
Unicode  ────────  0.006611 seconds
Mmap  ───────────  0.072765 seconds
Serialization  ──  1.190580 seconds
Libdl  ──────────  0.029516 seconds
Markdown  ───────  1.036088 seconds
LibGit2  ────────  2.653477 seconds
Logging  ────────  0.282062 seconds
Sockets  ────────  1.571644 seconds
Printf  ─────────  0.007632 seconds
Profile  ────────  0.195968 seconds
Dates  ──────────  1.756954 seconds
DelimitedFiles  ─  0.109094 seconds
Random  ─────────  0.713382 seconds
UUIDs  ──────────  0.012662 seconds
Future  ─────────  0.004585 seconds
LinearAlgebra  ──  9.753868 seconds
SparseArrays  ───  3.961595 seconds
SuiteSparse  ────  1.476663 seconds
Distributed  ────  6.395261 seconds
SharedArrays  ───  0.159251 seconds
Pkg  ──────────── 10.984870 seconds
Test  ───────────  0.871862 seconds
REPL  ───────────  0.789499 seconds
Statistics  ─────  0.187705 seconds
Stdlibs total  ── 48.199787 seconds
Sysimage built. Summary:
Total ───────  75.075785 seconds 
Base: ───────  26.874227 seconds 35.7961%
Stdlibs: ────  48.199787 seconds 64.2015%
Generating precompile statements... 904 generated in  83.484133 seconds (overhead  57.593296 seconds)
julia> @time include("compiler/compiler.jl")
 20.449931 seconds (34.89 M allocations: 1.706 GiB, 4.91% gc time)
 18.274587 seconds (22.55 M allocations: 1.088 GiB, 4.04% gc time)
 18.564789 seconds (22.49 M allocations: 1.085 GiB, 4.21% gc time)

master branch point:

Base  ─────────── 24.532865 seconds
Base64  ─────────  3.872494 seconds
CRC32c  ─────────  0.008283 seconds
SHA  ────────────  0.181091 seconds
FileWatching  ───  0.090749 seconds
Unicode  ────────  0.006631 seconds
Mmap  ───────────  0.072882 seconds
Serialization  ──  1.177631 seconds
Libdl  ──────────  0.029778 seconds
Markdown  ───────  2.044509 seconds
LibGit2  ────────  2.682191 seconds
Logging  ────────  0.309563 seconds
Sockets  ────────  1.529000 seconds
Printf  ─────────  0.006040 seconds
Profile  ────────  0.167854 seconds
Dates  ──────────  1.732500 seconds
DelimitedFiles  ─  0.108260 seconds
Random  ─────────  0.650892 seconds
UUIDs  ──────────  0.012112 seconds
Future  ─────────  0.005767 seconds
LinearAlgebra  ──  9.602296 seconds
SparseArrays  ───  3.823795 seconds
SuiteSparse  ────  1.450669 seconds
Distributed  ────  6.428315 seconds
SharedArrays  ───  0.158997 seconds
Pkg  ──────────── 10.749139 seconds
Test  ───────────  0.852206 seconds
REPL  ───────────  0.805774 seconds
Statistics  ─────  0.174339 seconds
Stdlibs total  ── 48.747497 seconds
Sysimage built. Summary:
Total ───────  73.282180 seconds 
Base: ───────  24.532865 seconds 33.4773%
Stdlibs: ────  48.747497 seconds 66.5203%
Generating precompile statements... 957 generated in  81.815108 seconds (overhead  55.129309 seconds)
julia> @time include("compiler/compiler.jl")
 19.697466 seconds (31.17 M allocations: 1.508 GiB, 4.70% gc time)
 19.200277 seconds (25.49 M allocations: 1.232 GiB, 4.44% gc time)
 19.460540 seconds (25.47 M allocations: 1.231 GiB, 4.39% gc time)

Review threads on src/julia.h (three, all outdated and resolved)
@vtjnash (Member, Author) commented Mar 27, 2019

@JeffBezanson OK to merge? There's additional work I'd like to get started on that builds on this.

@vtjnash vtjnash merged commit 8c44566 into master Mar 29, 2019
@vtjnash vtjnash deleted the jn/lambda-edges branch March 29, 2019 16:11
vchuravy added a commit to JuliaLabs/Cassette.jl that referenced this pull request Apr 2, 2019
vchuravy added a commit to JuliaLabs/Cassette.jl that referenced this pull request Apr 3, 2019
vchuravy added a commit to JuliaLabs/Cassette.jl that referenced this pull request Apr 3, 2019

if (internal == 1) {
    mi->uninferred = jl_deserialize_value(s, &mi->uninferred);
    jl_gc_wb(mi, mi->uninferred);
Member commented (on the code above):

To me it looks like the order here was (and remains to this day) reversed from serialization: compare

julia/src/dump.c, lines 726–736 at c3235cd:

write_uint8(s->s, internal);
if (!internal) {
    // also flag this in the backref table as special
    uintptr_t *bp = (uintptr_t*)ptrhash_bp(&backref_table, v);
    assert(*bp != (uintptr_t)HT_NOTFOUND);
    *bp |= 1;
}
if (internal == 1)
    jl_serialize_value(s, (jl_value_t*)mi->uninferred);
jl_serialize_value(s, (jl_value_t*)mi->specTypes);
jl_serialize_value(s, mi->def.value);
with

julia/src/dump.c, lines 1611–1627 at c3235cd:

int internal = read_uint8(s->s);
mi->specTypes = (jl_value_t*)jl_deserialize_value(s, (jl_value_t**)&mi->specTypes);
jl_gc_wb(mi, mi->specTypes);
mi->def.value = jl_deserialize_value(s, &mi->def.value);
jl_gc_wb(mi, mi->def.value);
if (!internal) {
    assert(loc != NULL && loc != HT_NOTFOUND);
    arraylist_push(&flagref_list, loc);
    arraylist_push(&flagref_list, (void*)pos);
    return (jl_value_t*)mi;
}
if (internal == 1) {
    mi->uninferred = jl_deserialize_value(s, &mi->uninferred);
    jl_gc_wb(mi, mi->uninferred);
}

If this is indeed a bug, I can submit a fix.
