-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support type renaming #22721
Support type renaming #22721
Conversation
FWIW, for the debugger, I'd like to have the original text source as well. |
Same here, for a tracing package. Parsing arbitrary source files is no fun. |
Wouldn't it be easier just to unconditionally emit a warning for redefining a type name / constant? |
Revise does, and the user then restarts the session. That's not the end of the world, but if you've got ~1min worth of JITting to get back to where you were, it's not ideal. That's sort of the point of this issue. Whether it's worth the effort it would take to change this is a separate question. |
250822b
to
eb1c523
Compare
I stripped the RFC from this, because I'm actually pretty close to getting this working within Revise (see timholy/Revise.jl#25), so now I genuinely want to merge this. The only part that's missing is a way of going from a
into a method type signature. Funny I've never noticed this before, but do we have an |
function rename_binding(mod::Module, oldname::Symbol, newname::Symbol) | ||
T = getfield(mod, oldname) | ||
ccall(:jl_rename_binding, Void, (Any, Any, Any), mod, oldname, newname) | ||
T.name.name = newname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would need a call to unwrap_unionall
here.
If this is used to redefine a type to have a different number of parameters, then re-evaluating existing method definitions won't generally work. |
I don't think we want to permit deleting an existing binding. We optimize pretty heavily based on the premise that a binding can't be renamed, so this seems still very likely to run into segfaults. At the very least, this method would need to unconditionally print a warning that Julia is now likely to crash. In #8870 (comment), Stefan suggested we permit replacing bindings for constants (with a warning). This allows code that was compiled for the old binding to continue working (kind of), while making it possible to redefine constants (like Type declarations) and probably not usually manage for the environment to keep working. |
Definitely. Will have to throw an informative error, but the nice thing is that in an interactively-revised session you can get there incrementally.
This doesn't delete anything, it just gives the old type a "deprecated" name (making comment characters part of the type name) so that the original name can be (re)used for the new type. It's a better choice than module reloading, see timholy/Revise.jl#18 (comment), where the Old and New modules print identically and thus yield confusing behavior for users when they wonder why old variables (defined using types exported from the Old module) don't trigger the modified methods in the New module. Here, any objects created in terms of Old types will print with their deprecated name, e.g.,
Can you link me to an example? In the admittedly few cases I've looked at, all the ASTs quite gracefully transition over to the "deprecated" name. In the example above, if I define & call julia> bar(x) = x.arg
julia> bar(a)
1
julia> m = first(methods(bar))
bar(x) in Main at REPL[4]:1
julia> m.specializations
TypeMapEntry(nothing, Tuple{#bar,Mine}, nothing, svec(), 22151, -1, MethodInstance for bar(::Mine), true, true, false)
julia> Base.rename_binding(Main, :Mine, Symbol("Mine#RV#1"))
Mine#RV#1
julia> m.specializations
TypeMapEntry(nothing, Tuple{#bar,Mine#RV#1}, nothing, svec(), 22151, -1, MethodInstance for bar(::Mine#1), true, true, false) even though I didn't declare an explicit type for I'm sure I'm thinking about this naively, but my mental picture is that Julia defines the type essentially via a pointer to a struct; I'm not changing that pointer. The struct has a field with a string in it that's the name that gets printed for the user and to compare to keyboarded code. All I'm doing here is changing the string. |
It's a deletion plus an addition from the module names table – our binding table doesn't support versioned renaming. It would be OK to do new naming, but that's just doing
How did the type manage to change names in printing? I thought we print the symbol
Yes, often, but it can do that only because it know that the binding can't change. If you instead did: bar(x) = Mine(x).arg Inference needs to know that that |
OK, admittedly I didn't dig into how this "Dict" is used, but conceptually it's a key-replacement, since the value (the binding) doesn't change. Do keys sometimes escape from the hash table? I guess what you're worried about is
I am, see the diff for
I think I know what you mean, but I'm not sure. Does this logic in Revise address your concern? (not merged to master, it's in timholy/Revise.jl#25). It collects all methods calling As you say, cache invalidation is hard, but here I'm hoping you've already done all the hard work and this is just gravy. |
In the #265 fix, I explicitly don't handle on-stack replacement, which would be required for this. Heuristically, that approach should mostly work currently, since it does cause the system to rebuild most of the state going forward. But I don't think I'm going to keep backedges around – I think the system would be less aggressive at cache invalidation if I stored forward edges, since it can then be more precise at verifying just whether or not the method table inference result changed, and ignore any spurious changes.
I don't see it. Did you forget to git add it? Or is this in a gist? |
Line 114 in eb1c523
|
eb1c523
to
bb6d58d
Compare
I'd like to revive this and therefore added a bunch of tests. I've barely tested it in real-world usage, but together with #33883 and this branch of Revise, things look fairly promising for type-renaming. The only hint of weirdness I've seen is encapsulated here. (EDIT: and one more, constructors don't seem to store backedges, so one can't automatically discover that |
I think it would be very hard to get this working. The issue you have is probably because we don't have backedges for bindings (i.e. the ability to invalidate code if a constant binding changes). Almost everything in the system is accessed via such bindings, so adding those edges seems very expensive to me. |
Can you clarify what circumstances will require invalidation? julia> struct A
x::Int
end
julia> foo(::A) = true
foo (generic function with 1 method)
julia> a = A(2)
A(2)
julia> foo(a)
true
julia> Base.rename_binding(Main, :A, :Aold)
Aold
julia> a
Aold(2)
julia> foo(a)
true
julia> methods(foo)
# 1 method for generic function "foo":
[1] foo(::Aold) in Main at REPL[2]:1 |
Inferred/compiled code that refers to the old binding. |
Can you give me a concrete demo? I thought bindings were resolved at compile time. This is probably completely redundant with the tests I added, but I threw in an attempt at compiled bindings in the module HasA
export A, used, unused
struct A
x::Int
end
used(::A) = true
unused(::A) = true
if ccall(:jl_generating_output, Cint, ()) == 1
@assert precompile(used, (A,))
end
end with test file using HasA, Test
function createA()
A = getfield(HasA, :A)
return A(2)
end
a = createA()
@test used(a)
Base.rename_binding(HasA, :A, :Aold)
@test used(a)
anew = createA()
@test used(anew)
@test unused(a) # not yet compiled
@test unused(anew) Unfortunately 😄 it worked. Both What I'm aiming for (but which may not be possible) is this: old objects get a new typename but in principle all their old methods continue to work---everything acts as if the new name had been used originally. No recompilation or anything is necessary. But this clears space for the user to define a new type with the old name. That new type has no methods other than its constructor. But that's OK, and it's definitely not Base's problem. It's a user problem, or Revise's. (RE)EDIT: to clarify, Revise I think the main question is whether users will be confused by having stale code generate and use old types. If so, I can think of several steps we could take to issue warnings about using old types (examining arguments issued from the command line, invalidating old specializations, putting a hook into inference to check for new compilations with old types, etc). This doesn't have to be merged immediately---we could let people play with this + that Revise branch for a while and see what problems crop up. But since the merge deadline is less than a month away and people are busy writing/reviewing code, I wanted to put this forward with time to address specific concerns. |
OK, I think there's a pretty good way forward, but I could use a little extra info. With this gist I count the number of julia> tc = TypeCounter{Float64}()
TypeCounter{Float64}(0, 0, 0, 0, 0, 0, 0)
julia> @time traverse(tc, Base)
WARNING: Threads.Mutex is deprecated, use ReentrantLock instead.
likely near REPL[5]:1
WARNING: Threads.RecursiveSpinLock is deprecated, use ReentrantLock instead.
likely near REPL[5]:1
0.341304 seconds (3.42 M allocations: 206.899 MiB)
TypeCounter{Float64}(46, 4212, 16174, 48884, 46417, 33002, 2086) So in 1/3 second (which is totally reasonable, and the time is entirely due to If needed, this provides a viable mechanism to discover all the methods that have cached inference results that use a particular type without needing to cache edges. My concern is those |
OK, serious progress. The latest version of the
Other than those caveats, the very preliminary analysis suggests it behaves exactly as you'd expect for a 265-like fix for type redefinition. 🎆 |
One way to look at this is that currently, sort of by luck, if the compiler depends on a certain constant binding, it inlines the value and removes any reference to the binding itself. So if you want that code to keep working the same way, it works automatically. But that is basically depending on an implementation detail. The easiest way to demonstrate the problem now might be with precompiled modules. When package |
Ah, very helpful. FWIW the tests in Revise do exercise precompiled modules, but I first load all the modules that access the binding and then I invalidate the type. The problem you describe would occur if I did it in the opposite order. Can I ask, why don't some methods cache inference results? Is it only the inline-worthy ones that are preserved? |
This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that `const` bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this. # Motivation The reasons for this change are manifold: 1. The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of #22721. 2. A secondary motivation is to make `const`-redefinition no longer UB (although `const` redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in #54099. 3. Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (#53958). 4. Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like #14055. # Implementation In this PR: - `Binding` gets `min_world`/`max_world` fields like `CodeInstance` - Various lookup functions walk this linked list using the current task world_age as a key - Inference accumulates world bounds as it would for methods - Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef - One primary complication is that our IR definition permits `const` globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in `Core.Compiler` that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds. - A new `@world` macro can be used to access bindings from old world ages. This is used in printing for old objects. - The `const`-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed. Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like: ``` f() = Any g(::f()) = 1 f() = Int ``` Does not redefine `g`. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings. # Demo ``` julia> struct Foo a::Int end julia> g() = Foo(1) g (generic function with 1 method) julia> g() Foo(1) julia> f(::Foo) = 1 f (generic function with 1 method) julia> fold = Foo(1) Foo(1) julia> struct Foo a::Int b::Int end julia> g() ERROR: MethodError: no method matching Foo(::Int64) The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it. Closest candidates are: Foo(::Int64, ::Int64) @ Main REPL[6]:2 Foo(::Any, ::Any) @ Main REPL[6]:2 Stacktrace: [1] g() @ Main ./REPL[2]:1 [2] top-level scope @ REPL[7]:1 julia> f(::Foo) = 2 f (generic function with 2 methods) julia> methods(f) # 2 methods for generic function "f" from Main: [1] f(::Foo) @ REPL[8]:1 [2] f(::@world(Foo, 0:26898)) @ REPL[4]:1 julia> fold @world(Foo, 0:26898)(1) ``` # Performance consideration On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly. # Semantic TODO - [ ] Do we want to change the resolution time of bindings to (semantically) resolve them immediately? - [ ] Do we want to introduce guard bindings when inference assumes the absence of a binding? # Implementation TODO - [ ] Precompile re-validation - [ ] Various cleanups in the accessors - [ ] Invert the order of the binding linked list to make the most recent one always the head of the list - [ ] CodeInstances need forward edges for GlobalRefs not part of the uninferred code - [ ] Generated function support
This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that `const` bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this. The reasons for this change are manifold: 1. The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of #22721. 2. A secondary motivation is to make `const`-redefinition no longer UB (although `const` redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in #54099. 3. Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (#53958). 4. Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like #14055. In this PR: - `Binding` gets `min_world`/`max_world` fields like `CodeInstance` - Various lookup functions walk this linked list using the current task world_age as a key - Inference accumulates world bounds as it would for methods - Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef - One primary complication is that our IR definition permits `const` globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in `Core.Compiler` that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds. - A new `@world` macro can be used to access bindings from old world ages. This is used in printing for old objects. - The `const`-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed. Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like: ``` f() = Any g(::f()) = 1 f() = Int ``` Does not redefine `g`. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings. ``` julia> struct Foo a::Int end julia> g() = Foo(1) g (generic function with 1 method) julia> g() Foo(1) julia> f(::Foo) = 1 f (generic function with 1 method) julia> fold = Foo(1) Foo(1) julia> struct Foo a::Int b::Int end julia> g() ERROR: MethodError: no method matching Foo(::Int64) The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it. Closest candidates are: Foo(::Int64, ::Int64) @ Main REPL[6]:2 Foo(::Any, ::Any) @ Main REPL[6]:2 Stacktrace: [1] g() @ Main ./REPL[2]:1 [2] top-level scope @ REPL[7]:1 julia> f(::Foo) = 2 f (generic function with 2 methods) julia> methods(f) [1] f(::Foo) @ REPL[8]:1 [2] f(::@world(Foo, 0:26898)) @ REPL[4]:1 julia> fold @world(Foo, 0:26898)(1) ``` On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly. - [ ] Do we want to change the resolution time of bindings to (semantically) resolve them immediately? - [ ] Do we want to introduce guard bindings when inference assumes the absence of a binding? - [ ] Precompile re-validation - [ ] Various cleanups in the accessors - [ ] Invert the order of the binding linked list to make the most recent one always the head of the list - [ ] CodeInstances need forward edges for GlobalRefs not part of the uninferred code - [ ] Generated function support
Several of us have found Revise.jl to be a significant productivity booster. In principle, it's now possible to keep a Julia session open for a week or more, at which point one of the main negatives of Julia---the cost of JITting your, e.g., plotting package---becomes a non-issue.
However, there are two events which prevent this from being commonplace. One is method-deletion (#20048) and the other is type redefinition (timholy/Revise.jl#18). It occurred to me that type redefinition may not be quite as nasty a problem as I've thought. A demo with this PR:
The only catch I'm aware of is:
However, I wonder if this might be solvable via
methodswith
plus maintaining a cache of the source-code expressions so that they can be re-evaluated. (Revise would like that anyway, since it re-parses and caches every source file so that it can detect diffs. This adds considerably to the package load time, but caching the Exprs to the.ji
file seems both cheap and effective.)