Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zero out memory of uninitialized fields #9147

Open
StefanKarpinski opened this issue Nov 25, 2014 · 171 comments
Open

zero out memory of uninitialized fields #9147

StefanKarpinski opened this issue Nov 25, 2014 · 171 comments
Assignees

Comments

@StefanKarpinski
Copy link
Member

Raised here: https://groups.google.com/forum/#!topic/julia-users/O5S8pPav5Ks.

@eschnett
Copy link
Contributor

Instead of initializing them to zero, we could initialize them to a large, nonsensical value to help catch access of uninitialized fields.

@StefanKarpinski
Copy link
Member Author

I guess, but this only applies to pure data types like Int or Float64. Frankly, I'm still not entirely convinced that this is a real usability problem, but unlike arrays, there's no good performance reason not to guarantee zeroed memory.

@ViralBShah
Copy link
Member

The zeroing could take a lot of time if you are inserting lots of things into a large tree or a list, or even an array of composite types.

@JeffBezanson
Copy link
Member

Like Stefan, I'm not too worried about performance here. One reason is that uninitialized object fields are relatively rare. new calls that pass all arguments would be unaffected. LLVM might also be able to remove the extra store in x = new(); x.fld = 1. And if the object is heap allocated, the overhead of an extra store would be comparatively small.

One corner case that could cause problems is uninitialized bits fields in immutable types. It's a gotcha if they are zeroed when allocating individual objects, but not when allocating an array of them. Right now we consistently say "bits aren't automatically zeroed". If you like automatic zeroing, you want it everywhere, and doing it sometimes is arguably worse than doing it never.

One way out of that corner case is to disallow explicitly-uninitialized bits fields in immutables. Uninitialized references in immutables have uses (e.g. Nullable{BigMutableThing}), but uninitialized bits fields are less reasonable.

Frankly, I'd rather leave it as-is, or zero everything. For small arrays we can just pay the price, and allocate big arrays directly with mmap. Might not be so bad.

@JeffBezanson JeffBezanson added the needs decision A decision on this change is needed label Nov 25, 2014
@eschnett
Copy link
Contributor

I'd be in favour of initializing everything. If this turns out to be a bottleneck, as measured in a validated benchmark, then we can see whether introducing ccall(:jl_allocate_uninitialized_array) for a few special cases wouldn't do the trick.

@StefanKarpinski StefanKarpinski changed the title zero memory of uninitialized fields zero out memory of uninitialized fields Nov 25, 2014
@stevengj
Copy link
Member

Regarding zero-ing everything, it wouldn't be too hard to change our malloc_a16 function (in gc.c, which is used to allocate arrays) to a calloc_a16 function, which called calloc, shifts the pointer, and stores the original pointer before the pointed-to data. This is how the _aligned_malloc function works on Windows, and how we defined a 16-byte (or 32-byte) aligned malloc for FFTW (which is so trivial I don't think relicensing would be an issue).

@ViralBShah
Copy link
Member

I would prefer zeroing everything rather only in some places - or use a specific byte pattern. I guess we can start by using calloc and validate through a benchmark as suggested.

@ViralBShah
Copy link
Member

Also this is something we can presumably backport to 0.3 for Ron's class, if it all works out.

@tknopp
Copy link
Contributor

tknopp commented Nov 26, 2014

Which would be a pretty drastic semantic change within one version number...

Regarding the actual issue: I think it is a good idea to initialize with zero if the performance degradation are negligible. Would still be good to have a flag to get the malloced array if desired.

@StefanKarpinski
Copy link
Member Author

Which would be a pretty drastic semantic change within one version number...

It's a safe change though since this is not a behavior anyone could reasonably rely on.

@stevengj
Copy link
Member

(@tknopp, you can always call pointer_to_array(convert(Ptr{T}, c_malloc(sizeof(T) * n)), n, true) or similar to get a malloced array, so I don't think we necessarily need a flag. Assuming the overhead of calloc is normally negligible, anyone needing an uninitialized array will be working at such a low level that calling c_malloc won't be unreasonable.)

I tend to agree that people shouldn't rely on this behavior and it probably shouldn't even be documented; they should use zeros(...) if they want guaranteed zero initialization. (Of course, the implementation of zeros in Base can take advantage of it.)

@tknopp
Copy link
Contributor

tknopp commented Nov 27, 2014

@StefanKarpinski: Indeed. Still, I am not sure if backporting features or semantic language changes is a good idea. Its hard to keep track in which version the feature gets in. Or one might even have to distinguish minor version numbers (e.g. 0.33 and 0.34) when a new feature gets in in 0.34. This then has impact for all packages...

@stevengj: While I use ones and zeros myself when initializing an array I think the Array constructor should be a valid way to initialize an array. Currently I am not using it because I want zero initialization. If the constructor would initialize with zero, it would be IMHO the more logical way to create an array. For every other datastructure I also use the constructor.

@stevengj
Copy link
Member

@tknopp, I'm not saying you shouldn't use the constructor. I'm saying that if a calloc version is fast enough then we need not provide a high-level uninitialized-array constructor (nor "a flag" for this).

@stevengj
Copy link
Member

I made an experimental branch that uses calloc instead of malloc, and so far I haven't been able to detect any performance differences (all the differences are swamped by the measurement noise) on MacOS.

@StefanKarpinski
Copy link
Member Author

Interesting and tangentially related: http://research.swtch.com/sparse

@RauliRuohonen
Copy link

Do you want users rely on zero initialization? If yes, best implement and document it so everyone's on the same page. If no, use some nonzero filler like 0b10101010 or just leave it uninitialized like it is today. Facts of life: if you implement zero initialization, users will rely on it, documented or not, whether you want them to or not. Either way, there should be some easy way to get uninitialized memory, like e.g. NumPy has empty() in addition to zeros() and ones() which you can use when you want performance.

@sbromberger
Copy link
Contributor

@RauliRuohonen in the absence of explicit documentation to the contrary (and even then, not guaranteed), users will default to assuming zero initialization. This is the case in Graphs.jl, where dijkstra_shortest_paths can return uninitialized memory (see JuliaAttic/OldGraphs.jl#140 for an example).

This newbie's vote is for zero-by-default, and the sooner it's implemented, the better.

@ViralBShah
Copy link
Member

I personally would prefer a byte pattern if we were to do this.

Also it is quite safe to do this by default, and in the few performance sensitive cases, have a way to get uninitialized memory.

@sbromberger
Copy link
Contributor

I personally would prefer a byte pattern if we were to do this.

I'm genuinely curious - why would a byte pattern be preferable to zeros, especially when new pages are supposedly zeroed by the OS by default?

@JeffBezanson
Copy link
Member

A byte pattern makes it easier to find uses of uninitialized values. The implication is that people must make sure to manually initialize everything, or else they will get some big useless value which at least makes it easier to find the bug.

However, this strikes me as going out of our way to slap people on the wrist. If we are going to put in the effort to guarantee initialization, I'd rather do people a favor and initialize with a likely-useful value (zero). You'd never need to write Foo(x,y) = new(0,0). And given calloc, there might be a performance advantage.

@sbromberger
Copy link
Contributor

they will get some big useless value which at least makes it easier to find the bug

Or, in a worst case, they will get a big useless value that is close enough to an expected value that it slips through, and causes some catastrophic failure down the line?

Unless Julia's going to explicitly test and warn on uninitialized values using this byte pattern (thereby voiding any legitimate uses of that particular pattern), I don't see the advantage - and I see two disadvantages: 1), as you said, calloc() provides an optimized zero, and writing a specific byte pattern might result in poorer peformance; and 2) the principle of "do[ing] what is expected" seems to favor zeros.

@StefanKarpinski
Copy link
Member Author

I think that either doing what we do now or initializing with zeros and having that be a specified, reliable behavior are the two best options.

@johnmyleswhite
Copy link
Member

I think initializing to zeros is really the way to go unless there's a serious performance cost. It simplifies the mental model of how memory allocation works and provides a lot more security.

@sbromberger
Copy link
Contributor

Proposal: zero-fill by default; provide a named parameter for an option to use "raw" malloc for when performance is über-critical.

The security issue is nothing to sneeze at, especially, for example, when building out web services with authenticated sessions. Also, it would make auditing things like Crypto.jl that much more complex.

@sbromberger
Copy link
Contributor

@JeffBezanson can I hope that your self-assignment here means that we're making progress on this? :)

@JeffBezanson
Copy link
Member

It means, at least, that there seems to be a consensus that we should do this before 1.0.

@jebej
Copy link
Contributor

jebej commented Jul 3, 2017

It was mentioned a few times that there would not be a performance penalty for zero initialization, but this currently does not seem to be the case (unless I am misunderstanding something):

julia> @benchmark zeros(10^6)
BenchmarkTools.Trial:
  memory estimate:  7.63 MiB
  allocs estimate:  2
  --------------
  minimum time:     1.458 ms (0.00% GC)
  median time:      1.617 ms (0.00% GC)
  mean time:        2.135 ms (24.87% GC)
  maximum time:     44.558 ms (96.23% GC)
  --------------
  samples:          2333
  evals/sample:     1

julia> @benchmark Vector{Float64}(10^6)
BenchmarkTools.Trial:
  memory estimate:  7.63 MiB
  allocs estimate:  2
  --------------
  minimum time:     876.000 ns (0.00% GC)
  median time:      2.046 μs (0.00% GC)
  mean time:        41.900 μs (84.46% GC)
  maximum time:     41.694 ms (99.64% GC)
  --------------
  samples:          10000
  evals/sample:     1

I'm still having trouble understanding why one would ask for uninitialized memory and then be caught by a "bug" due to that. If you want a zeroed out array, call the zeros function. There is no good reason why 0 is a better "uninitialized" value than any other, and as mentioned above there are cases where 0 does not make much sense (e.g. for a rational 0//0).

It seems like most of the trouble mentioned in this thread could be solved by a find-and-replace of Array with zeros...

@stevengj
Copy link
Member

stevengj commented Jul 3, 2017

@jebej, to eliminate the performance penalty, we have to change the underlying implementation to use calloc. The current zeros will definitely be slower.

@StefanKarpinski
Copy link
Member Author

I'm still having trouble understanding why one would ask for uninitialized memory and then be caught by a "bug" due to that. If you want a zeroed out array, call the zeros function. There is no good reason why 0 is a better "uninitialized" value than any other, and as mentioned above there are cases where 0 does not make much sense (e.g. for a rational 0//0).

Zero is often a pretty good choice for uninitialized memory, e.g. for integers, floats. Most of the time you end up getting zero anyway. Except when you don't. If zero is an ok value and non-zero is a bad value, now you have a rare, non-deterministic bug. Anyone who's had to track down a rare, non-deterministic bug can tell you that this is the worst kind of bug. (Except for rare, non-deterministic bugs that disappear when you look for them – those are even worse.)

In some sense, the "right" solution is to fill uninitialized memory with truly random bytes so that bugs in code that accidentally use uninitialized memory are more reliably forced to happen. However, then you still have a predictability issue since there could still be rare cases where random values are bad. So filling memory with random data doesn't solve the problem either. Moreover, filling memory with random data is really expensive. Filling memory with some fixed value, on the other hand, does solve the problem since if that value is problematic, it's reliably problematic. And if that value is zero, then doing this is essentially free.

Finally, other languages (e.g. C#) which have made the choice to fill uninitialized memory with zeros do not seem to have any regrets about this – there are no blog posts calling this a "billion dollar mistake" or complaining about it at all, for that matter.

@StefanKarpinski
Copy link
Member Author

This is a non-breaking change, so we can move it off the 1.0 milestone.

@ViralBShah
Copy link
Member

Do we still want to keep this issue open?

@StefanKarpinski
Copy link
Member Author

Yes, I still think we should do this. I believe that @JeffBezanson wants to do it as well.

@JeffreySarnoff
Copy link
Contributor

+1

@xgdgsc
Copy link
Contributor

xgdgsc commented Feb 21, 2023

Any progress? Is there a "memset" to 0 on struct I could use to write less zeros in constructor ?

@gbaraldi
Copy link
Member

There are a couple steps we can take here. I believe you can ask LLVM to zero things out for you, and we probably have to do something on the side of the GC. Maybe zero on page initialization/ask for zeroed pages, and then zero it again when freeing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests