Revert "Do not cache the type before we finish constructing it." #11632
Perhaps, but did this cause the CI failure? There's not much point in just frobbing the system until intermittent failures temporarily go away. Also, solving only some problems is allowed! |
It looks like all those recent failures are on 32-bit, which is suspicious. |
Yeah. No idea why 32-bit... (It makes some sense for the GC, which is under higher pressure there, but not so much for the type system.) |
The 32-bit |
Another one of my PRs (partially reverted by #11621) seems to be exposing something else as well (mainly because it had been passing on Linux and only caused issues on Windows after merging with other commits). I was going to investigate with another PR later, but maybe some Windows/LLVM expert can make sense of it in its current state.... |
I have no idea, but this is wasting AppVeyor time on a bunch of other open PRs. I'd rather not make them fail pointlessly. |
We've seen many of these errors before. On a local win32 build of latest master I get
This is #10875, which was fallout from #10380. I don't pretend to understand this type cache stuff, but it's awfully fragile. We sure could use a more deterministic set of tests for it. For now I think we should merge this. |
Tried a 32b VM with the same ubuntu version as travis, no luck reproducing. The problem is during the evaluation of The If for any reason (OOM ? the type cache has 3/2 geom growth but still this looks like a tiny thing) the apply_type fails in inference it would go through static_eval in codegen on what I believe is a codepath which is not much used since for constant arguments inference should have figured it out already (and this does not check for exceptions!). So either something wrong happens in one of those two cases, or the type cache logic itself is wrong in an almost-right way (since everything mostly works). |
I was also trying to reproduce it and hit #11642 ... |
Is this related to why |
Maybe it's related to the somewhat-special treatment of |
Now that I think of it, it cannot even be a caching problem, since subtype falls back on jl_egal of the parameters if pointer equality fails. For example:
julia> abstract K
julia> VK = Vector{K}
Array{K,1}
julia> VK.parameters = Base.svec(Any,1); VK
Array{Any,1}
julia> []::VK
0-element Array{Any,1}
works fine (one could argue it should not, but that's beside the point), as well as in generated code. So the only way for the error to happen is for those two types to have a different |
So
julia> function f()
           VK = Vector{K}
           VK.parameters = Base.svec(Any, 1)
           []::VK
       end
f (generic function with 1 method)
julia> f()
ERROR: TypeError: f: in typeassert, expected Array{Any,1}, got Array{Any,1}
 in f at ./none:4 |
Ha. I was wrong indeed, we do codegen pointer equality for leaf typeasserts (emit_typecheck), so it is likely to be a caching problem. The typetag of the object is surely the "right" one since it comes straight from alloc_cell_1d/an_empty_cell. The pointer we are putting in the generated code has probably avoided the cache in some way. |
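To make the failure mode above concrete, here is a minimal C sketch of why a pointer-only type check depends on every type going through the cache. Everything in it (`toy_type_t`, `toy_intern`, `toy_egal`, `toy_typecheck_fast`) is a hypothetical stand-in and not Julia's runtime API; it only assumes that a leaf typeassert compares pointers, as described in the comment above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parametric type such as Array{T,1}. */
typedef struct {
    const char *name;   /* e.g. "Array" */
    const char *param;  /* e.g. "Any"   */
} toy_type_t;

#define TOY_CACHE_CAP 16
static toy_type_t toy_cache[TOY_CACHE_CAP];
static size_t toy_cache_len = 0;

/* Slow path: structural equality on the contents of two types. */
static int toy_egal(const toy_type_t *a, const toy_type_t *b) {
    return strcmp(a->name, b->name) == 0 && strcmp(a->param, b->param) == 0;
}

/* Return the unique cached instance, creating it on first use. */
static const toy_type_t *toy_intern(const char *name, const char *param) {
    toy_type_t key = { name, param };
    for (size_t i = 0; i < toy_cache_len; i++)
        if (toy_egal(&toy_cache[i], &key))
            return &toy_cache[i];
    assert(toy_cache_len < TOY_CACHE_CAP);
    toy_cache[toy_cache_len] = key;
    return &toy_cache[toy_cache_len++];
}

/* Fast path, as assumed for a leaf typeassert: pointer comparison only. */
static int toy_typecheck_fast(const toy_type_t *tag, const toy_type_t *expected) {
    return tag == expected;
}
```

Two calls to `toy_intern("Array", "Any")` return the same pointer, so the fast check passes. A `toy_type_t` built by hand with the same contents is structurally equal but fails the fast check, which is the shape of the "expected Array{Any,1}, got Array{Any,1}" error.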
Revert "Do not cache the type before we finish constructing it."
This has made a day's worth of CI builds fail. Not worth it. Please consider it a little more urgent to keep master functioning, or it disrupts everyone who tries to contribute even in completely unrelated areas. Whatever you do to deal with this, it needs a deterministic way of testing it, or this is going to keep happening over and over again. |
Ok found the issue, thanks to the Travis folks giving ssh into a vm. In the core test we create a I'm not sure it solves the original "error while instantiating" but at least we know why it failed even without it :-) |
By the way, I think it wouldn't hurt to have a quick look at all the places that touch those uids, since everything can "mostly work" when only a few types are compared wrong. |
@carnaval I would have to read the code before I can fully understand what you meant but from what I understand, shouldn't delaying cache insertion make it less likely to happen? |
The ordering of A{T} in the cache is given by either T->uid or, if it is 0, object_id(T). The way (I think) it went was : |
(then when the cache is not sorted, you can have a false cache miss, say for Vector{Any}, which makes you create a new instance and crash) |
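The false cache miss described above can be reproduced with a small C sketch. It assumes a cache kept sorted by a key that is the uid when nonzero and a fallback object id otherwise, and searched by bisection; the names (`toy_param_t`, `toy_sort_key`, `toy_cache_insert`, `toy_cache_lookup`) and the numeric values are hypothetical, not the runtime's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type parameter: uid == 0 means "not assigned yet". */
typedef struct {
    unsigned uid;
    unsigned object_id;  /* fallback key, stands in for object_id(T) */
} toy_param_t;

/* Ordering key: the uid if assigned, otherwise the fallback id. */
static unsigned toy_sort_key(const toy_param_t *p) {
    return p->uid != 0 ? p->uid : p->object_id;
}

#define TOY_CAP 8
static const toy_param_t *toy_cache[TOY_CAP];
static size_t toy_len = 0;

/* Insert while keeping the array sorted by the key seen at insertion time. */
static void toy_cache_insert(const toy_param_t *p) {
    assert(toy_len < TOY_CAP);
    size_t i = toy_len++;
    while (i > 0 && toy_sort_key(toy_cache[i - 1]) > toy_sort_key(p)) {
        toy_cache[i] = toy_cache[i - 1];
        i--;
    }
    toy_cache[i] = p;
}

/* Binary search; only correct while the array really is sorted. */
static int toy_cache_lookup(const toy_param_t *p) {
    size_t lo = 0, hi = toy_len;
    unsigned k = toy_sort_key(p);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        unsigned mk = toy_sort_key(toy_cache[mid]);
        if (mk == k)
            return toy_cache[mid] == p;
        if (mk < k)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;  /* reported as a miss */
}
```

If an entry is inserted while its uid is still 0 and the uid is assigned afterwards, its key changes in place and the array is no longer sorted. A lookup for a different, untouched entry can then bisect the wrong way and report a miss even though the entry is still present.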
Shouldn't
if (!jl_is_abstracttype(type) && ((jl_datatype_t*)type)->uid == 0)
    ((jl_datatype_t*)type)->uid = jl_assign_type_uid();
in |
The UID needs to be assigned before instantiating fields. |
Yup, it's the uid of a type parameter (of Vector) which is missing for an element being inserted in the Vector cache, not the uid of the Vector type itself. |
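One way to keep the ordering key stable is to make sure a parameter's uid is assigned before the key is ever computed. The C sketch below illustrates that idea only; it is not the fix that was eventually merged, and `toy_ensure_uid`, `toy_stable_key` and the counter are hypothetical names.

```c
#include <assert.h>

/* Hypothetical type parameter: uid == 0 means "not assigned yet". */
typedef struct {
    unsigned uid;
    unsigned object_id;  /* fallback id, deliberately unused here */
} toy_param_t;

static unsigned toy_next_uid = 1;

/* Assign a uid once; later calls leave it unchanged. */
static unsigned toy_ensure_uid(toy_param_t *p) {
    if (p->uid == 0)
        p->uid = toy_next_uid++;
    return p->uid;
}

/* Key used for cache ordering. Because the uid is assigned first, the key
   seen at insertion time is the same one every later lookup will see. */
static unsigned toy_stable_key(toy_param_t *p) {
    unsigned k = toy_ensure_uid(p);
    assert(k != 0);
    return k;
}
```

With this ordering the key never falls back to the object id, so it cannot change after an entry is already in a sorted cache.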
Ahh. I see that the is of the type parameters is used in |
Thanks for the explanation. That part of the code (and the problem) makes much more sense to me now. I'll try to come up with a better PR if you don't fix it before me =). |
Feel free to go ahead, I had my dose of type cache today ;) |
@yuyichao pretty sure I've seen those on Travis too, those do look quite similar in that type comparison to |
Bump. @yuyichao are you planning to write a patch for this or should I do it? |
https://ci.appveyor.com/project/StefanKarpinski/julia/build/1.0.5748/job/avrcy4volwl25rfx looks probably related? |
@JeffBezanson please go ahead if you want it to be fixed. Sorry if I stopped you from doing it. |
Done. |
Reverts #11606
As @vtjnash points out in #11606 (comment), #11606 doesn't solve all the problems (it only solves some of the cases that have been brought up), and there's a weird CI failure that seems to be related = = ....
Sorry about that....