Avoid introducing local variable (and GC frame store) in unsafe_setindex!
#13461
................. The extra allocation is due to #13359 ......
Note that the code generated for this PR is still better than the one for #13463 since there's no GC frame at all.
2d39883 to ec3d2f1
@yuyichao it would be good to fix the underlying problem, but this looks fine; the original code wasn't really idiomatic anyway.
…r is confused about the return point.
ec3d2f1 to bb247cf
Interesting. I didn't realize there was a penalty for
I did a
This is awesome. Looks like it speeds up the non-scalar array perf tests by 5-15%. I agree that we should get a better fix for this eventually, but this is great for now.
Avoid introducing local variable (and GC frame store) in `unsafe_setindex!`
backported in adb832a
This is a workaround to avoid the generation of a GC frame store when `unsafe_setindex!` is called directly.

The problem seems to be that type inference is not able to figure out the exit point of `@inbounds return ...`. (Or more precisely, the second return is never reachable and the result is not used anyway.)

The code above generates a pointer local variable and disables SIMD. There are many fixes/optimizations that we could do at multiple levels to avoid the store in the loop, but I think this workaround should be the safest to backport to 0.4 with minimal side effects (not necessarily 0.4.0).
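For readers who want to see the shape of the pattern being discussed, here is a minimal sketch. The function names are hypothetical stand-ins rather than the actual Base definitions, and the second form is only one possible restructuring, not necessarily the change made in this PR. Whether the two forms compile differently depends on the Julia version, so the sketch only shows that they behave the same.

```julia
# Hypothetical stand-ins for illustration; these are not the Base definitions.

# Pattern singled out in the description: `return` nested inside `@inbounds`.
function store_with_inner_return!(A::Vector{Float64}, x::Float64, i::Int)
    @inbounds return setindex!(A, x, i)
end

# One possible restructuring (an assumption, not taken from the PR diff):
# only the store sits under `@inbounds`, and the array is returned by a
# separate, plain `return`.
function store_then_return!(A::Vector{Float64}, x::Float64, i::Int)
    @inbounds A[i] = x
    return A
end

A = zeros(4)
store_with_inner_return!(A, 1.0, 2)  # writes 1.0 into A[2]
store_then_return!(A, 2.0, 3)        # writes 2.0 into A[3]
```

To compare the generated code on a given Julia version, `code_llvm` can be called on each function with the argument types `(Vector{Float64}, Float64, Int)`.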
This is an alternative solution to what's in #13459.
It seems that `A[:] = 0.` still generates one allocation while `fill!` doesn't, so #13459 might still be good to have (or we should find out why `A[:] = 0.` allocates and fix that in general). (edit: It's the splatting penalty, see #13461 (comment))