-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: IdDict replaces ObjectIdDict #25210
RFC: IdDict replaces ObjectIdDict #25210
Conversation
base/abstractdict.jl
Outdated
for (k,v) in pairs; d[k] = v; end | ||
d | ||
end | ||
end | ||
const _IdDict = IdDict{Any,Any} # this is needed to make src/dump.c and src/staticdata.c work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is iffy. I did this to make it compile, see comment below.
src/dump.c
Outdated
@@ -2297,7 +2297,7 @@ JL_DLLEXPORT int jl_save_incremental(const char *fname, jl_array_t *worklist) | |||
htable_new(&backref_table, 5000); | |||
ptrhash_put(&backref_table, jl_main_module, (char*)HT_NOTFOUND + 1); | |||
backref_table_numel = 1; | |||
jl_idtable_type = jl_base_module ? jl_get_global(jl_base_module, jl_symbol("ObjectIdDict")) : NULL; | |||
jl_idtable_type = jl_base_module ? jl_get_global(jl_base_module, jl_symbol("_IdDict")) : NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/staticdata.c
Outdated
@@ -1205,7 +1205,7 @@ static void jl_save_system_image_to_stream(ios_t *f) | |||
} | |||
} | |||
|
|||
jl_idtable_type = jl_base_module ? jl_get_global(jl_base_module, jl_symbol("ObjectIdDict")) : NULL; | |||
jl_idtable_type = jl_base_module ? jl_get_global(jl_base_module, jl_symbol("_IdDict")) : NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rehashing will now need to be done for all instantiations of IdDict
, not just IdDict{Any,Any}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But just replacing it with jl_symbol("IdDict")
does not work. Julia compiles fine but the parallel stuff does not work (and thus the test don't run either). On eac2ff7:
julia> using Distributed; addprocs(2)
From worker 2: Master process (id 1) could not connect within 60.0 seconds.Worker 2 terminated.
ERROR:
From worker 2: exiting.StackOverflowError:
haskey at ./abstractdict.jl:553 [inlined]
is_root_module at ./loading.jl:371 [inlined]
serialize_mod_names(::Distributed.ClusterSerializer{TCPSocket}, ::Module) at ./serialize.jl:355
serialize_mod_names(::Distributed.ClusterSerializer{TCPSocket}, ::Module) at ./serialize.jl:358 (repeats 97 times)
...and 1 more exception(s).
Stacktrace:ERROR (unhandled task failure): Version read failed. Connection closed by peer.
[1] sync_end() at ./task.jl:300
[2] macro expansion at ./task.jl:316 [inlined]
[3] #addprocs_locked#40(::NamedTuple{(),Tuple{}}, ::Function, ::Distributed.LocalManager) at /home/mauro/julia/julia-master/usr/share/julia/site/v0.7/Distributed/src/cluster.jl:401
[4] addprocs_locked at /home/mauro/julia/julia-master/usr/share/julia/site/v0.7/Distributed/src/cluster.jl:372 [inlined]
[5] #addprocs#39(::NamedTuple{(),Tuple{}}, ::Function, ::Distributed.LocalManager) at /home/mauro/julia/julia-master/usr/share/julia/site/v0.7/Distributed/src/cluster.jl:365
[6] addprocs at /home/mauro/julia/julia-master/usr/share/julia/site/v0.7/Distributed/src/cluster.jl:361 [inlined]
[7] #addprocs#268 at /home/mauro/julia/julia-master/usr/share/julia/site/v0.7/Distributed/src/managers.jl:316 [inlined]
[8] addprocs(::Int64) at /home/mauro/julia/julia-master/usr/share/julia/site/v0.7/Distributed/src/managers.jl:315
[9] top-level scope
I think the problem is the re-hashing:
julia> haskey(Base.module_keys, Main) # true about half the time
false
julia> Base.rehash!(Base.module_keys);
julia> haskey(Base.module_keys, Main)
true
base/abstractdict.jl
Outdated
IdDict() = IdDict{Any,Any}() | ||
IdDict(kv::Tuple{}) = IdDict() | ||
|
||
IdDict(ps::Pair{K,V}...) where {K,V} = IdDict{K,V}(ps) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't need all this; we can just default to Any,Any
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will make IdDict construction different from Dict construction. Is that desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It already is; previously all ObjectIdDicts
were AbstractDict{Any,Any}
s. ObjectIdDicts are also often heterogeneous, and I don't see a need to break those use cases (i.e. where different-typed keys are added to a dict after construction).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make Julia more consistent (c.f. #20402) to have IdDict constructors behave like Dict constructors (and also like Arrays, and presumably many others which try to guess a tight type parameter according to the arguments). Also note that of the uses in Base, only one would need updating toIdDict{Any,Any}(...)
.
That would also make a possible future transition to a more unified dict-backend along the lines of #24932 easier.
base/abstractdict.jl
Outdated
end | ||
|
||
get(t::ObjectIdDict, @nospecialize(key), @nospecialize(default)) = | ||
ccall(:jl_eqtable_get, Any, (Any, Any, Any), t.ht, key, default) | ||
function get(d::IdDict{K,V}, key::K, default) where {K, V} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key doesn't need to be of type K
, since it's ok for it not to be found here. The @nospecialize
annotations can also probably be retained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so you'd prefer a KeyError to a MethodError. Would it then also make sense to throw a KeyError for setindex!
?
Also, should I retain all @nospecialize
or just the one you mentioned here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is similar enough to the ArgumentError("$key0 is not a valid key for type $K")
that Dict
throws that we should use the same error here?
And yes, we should keep all the @nospecialize
. The goal of this should just be to allow type parameters in ObjectIdDict
with the minimum changes.
base/abstractdict.jl
Outdated
v = ccall(:jl_eqtable_get, Any, (Any, Any, Any), d.ht, key, default) | ||
v===default ? default : v::V | ||
end | ||
function getindex(d::IdDict{K,V}, key::K) where {K, V} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, there should be a KeyError
here if the key is not of type K
.
base/abstractdict.jl
Outdated
# TODO: this can underestimate `ndel` | ||
val === default || (t.ndel += 1) | ||
return val | ||
if val==default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
===
?
👍 Yes I think this is the right approach. |
Thanks for the comments. However, I'm stuck on #25210 (comment). |
base/abstractdict.jl
Outdated
IdDict{K,V}(d::IdDict{K,V}) where {K, V} = new{K,V}(copy(d.ht)) | ||
function IdDict{K,V}(d::IdDict{K}) where {K, V} | ||
d = IdDict{K,V}() | ||
sizehint!(d, length(pairs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method seems wrong. Is it a mistaken copy-paste of the second method above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks, the pairs
is wrong. But the method itself is needed:
My thinking here was to provide a method to allow something like:
di = IdDict{Symbol,Int}(:A=>1)
IdDict{Symbol,Float64}(di)
the loop doing the setindex!
will call convert on the integer values of di
.
I'll fix it and add a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it is not needed after all as it will call into the general function IdDict{K,V}(itr) ...
method. Thanks!
eac2ff7
to
a509428
Compare
I can spend some more time on this tomorrow to try and hunt down the right invocation in dump.c and staticdata.c to make automatic re-hashing work again. See #25210 (comment) for details. I suspect that the problem is that However, I have no clue on how to make this work so it will be blind trial and error. If someone finds the time to point me in the right direction, that would be much appreciated. |
a509428
to
c50378f
Compare
After an educational morning digging through the C-part of Julia, I found a way to make it work 🎆. The equality test needed here (and also in dump.c) had to be converted into a subtype test. I did not find a function for this so I took inspiration from, e.g., array subtyping tests here. The test suite now passes without a hickup (barring any rebase hickups before the push), but definitely needs checking by someone who groks this. Other things I did:
|
I'll be away until 3 Jan. In the unlikely event that this acquires urgency, feel free modify this PR. |
This still needs a review, in particular the C-part. |
base/abstractdict.jl
Outdated
pop!(t, key, secret_table_token) | ||
t | ||
function delete!(d::IdDict{K}, @nospecialize(key)) where K | ||
!isa(key, K) && throw(KeyError(key)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think any other methods of delete!
throw KeyErrors.
This looks great, let's do it! |
c50378f
to
bca7dd7
Compare
Thanks Jeff. I removed the KeyError from So this is ready to be merged when CI passes, unless it is deemed necessary to run nanosoldier on this, although I think there are no direct ObjectIdDict tests in there. I have re-run some primitive performance tests in https://gist.github.com/mauro3/ada59b1fd9a09202631edccf255c4e1b#gistcomment-2323560, and find no regression. |
- also updated constuctors to be in-line with Dict - added some extra tests
bca7dd7
to
56a5133
Compare
Looks like the one failure on AppVeyor was a time-out (finished at 2h exactly). |
Yes that's happening on basically every build lately. |
The doc builds take so long these days that AppVeyor times out all the time. It's a pretty urgent problem at the moment. cc @mortenpi: it would be great to get a fix – any fix for this merged asap. |
Thinking about it some more, and while updating some code to use this, I wonder if all |
I'd like to argue against this once more (x-ref). Most other datatypes, e.g. Array, Set, Ref, etc. infer their type parameters based on input to the constructors (are there exceptions?). Having
One reason to have the type parameters is to make the functions using an Edit: would it make sense to discuss this over in #20402? Either way, the decision should be taken now. Let me know if you want to change it and I can prepare a PR. |
The consistency argument is pretty compelling, especially given that initializing an |
Fair enough; it's ok to keep it as-is. |
- also updated constuctors to be in-line with Dict - added some extra tests
Third time lucky? This is a the third attempt, after #25196 and #24932. After letting Jeff's comment in #25196 sink in, I decided to not wrap ObjectIdDict but to directly furbish its getters and setters with type assertions. This is cleaner than either of my previous attempts. However, this doesn't quite work yet:
the tests run locally but then the julia process does not exit (CPU is back to 0%). I suspect it has something to do with the C-part.
Also, I removed a lot of the nonspecialize stuff. Not sure what the impact of that is.