
Allocations and GC being ignored? #26

Open
andyferris opened this issue Nov 2, 2016 · 8 comments

@andyferris

I'm finding different results using BenchmarkTools as opposed to my own loops and @time, particularly regarding allocating operations.

I'm benchmarking Base.Array vs the types in StaticArrays.jl. Some of these arrays are immutable, and don't perform GC allocations, and others are mutable. If you are doing lots of small, e.g., matrix multiplications, then the allocations and GC can dominate the cost, and an immutable array is much faster. I have a non-allocating SArray and a (mutable) allocating MArray.

The typical results I was getting using loops and @time showed that when new copies of Array or MArray were created, a lot of time was spent on allocation and GC (but not for SArray):

=====================================
   Benchmarks for 4×4 matrices
=====================================

Matrix multiplication
---------------------
Array               ->  6.526125 seconds (31.25 M allocations: 3.492 GB, 6.61% gc time)
SArray              ->  0.369290 seconds (5 allocations: 304 bytes)
MArray              ->  1.964021 seconds (15.63 M allocations: 2.095 GB, 12.05% gc time)

Matrix multiplication (mutating)
--------------------------------
Array               ->  4.540372 seconds (6 allocations: 576 bytes)
MArray              ->  0.748238 seconds (6 allocations: 448 bytes)
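The numbers above came from hand-written loops timed with @time. The OP's actual harness isn't shown in this thread, but a minimal sketch of that style of aggregate benchmark might look like this (function and variable names are assumptions, not the original code):

```julia
# Hypothetical reconstruction of a loop + @time harness.
# Each iteration of the non-mutating multiply allocates a fresh result
# for Array/MArray, so @time's aggregate report includes allocation + GC.
using StaticArrays

function bench_mul(m1, m2, N)
    m3 = m1 * m2
    for _ in 1:N
        m3 = m1 * m2  # fresh allocation per iteration for heap-backed types
    end
    return m3
end

a = rand(4, 4)           # Base Array (heap-allocated)
s = SMatrix{4,4}(a)      # immutable static matrix (no heap allocation)

bench_mul(a, a, 1)       # warm up to exclude compilation from the timing
bench_mul(s, s, 1)
@time bench_mul(a, a, 10^6)
@time bench_mul(s, s, 10^6)
```

Because @time reports the total over all N iterations, allocation and GC costs accumulate and become visible, unlike a per-call estimate.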

However, I switched my tests to BenchmarkTools.jl and now the difference between SArray and MArray has disappeared. It appears almost as if the allocation costs have been hidden somehow. Perhaps I'm using the package wrong, but I get:

=====================================================================
 Matrices of size 4×4 and eltype Float64
=====================================================================
SArray:    m3 = m1 * m2 takes 52.00 ns, 144.00 bytes (GC 0.00 ns)
Array:     m3 = m1 * m2 takes 196.00 ns, 240.00 bytes (GC 0.00 ns)
MArray:    m3 = m1 * m2 takes 55.00 ns, 144.00 bytes (GC 0.00 ns)

Array:     A_mul_B!(m3, m1, m2) takes 150.00 ns, 0.00 bytes (GC 0.00 ns)
MArray:    A_mul_B!(m3, m1, m2) takes 20.00 ns, 0.00 bytes (GC 0.00 ns)

The two calls I make are @benchmark *($(copy(m)), $(copy(m))) and @benchmark A_mul_B!($(copy(m)), $(copy(m)), $(copy(m))), where m is some random 4×4 matrix made out of the above types. Is that the right way to use @benchmark?
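For concreteness, the two calls described above can be sketched as follows (assuming BenchmarkTools and StaticArrays are loaded; note that A_mul_B! is the Julia 0.5/0.6-era name, replaced by mul! in 0.7+):

```julia
using BenchmarkTools, StaticArrays

m = rand(4, 4)  # swap in SMatrix{4,4}(rand(4,4)) or MMatrix{4,4}(rand(4,4))

# Interpolating $(copy(m)) evaluates copy(m) once, at benchmark setup time,
# so the timed expression operates on concrete local values rather than
# repeatedly looking up a non-const global.
b1 = @benchmark *($(copy(m)), $(copy(m)))

# In-place multiply: the destination is preallocated, so no allocation
# is expected inside the timed expression.
b2 = @benchmark A_mul_B!($(copy(m)), $(copy(m)), $(copy(m)))
```

Per the BenchmarkTools manual, interpolating setup values this way is the intended usage, so the pattern in the OP looks correct on its face.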

@andyferris
Author

Here I use mean instead of the (default) median:

=====================================================================
 Matrices of size 4×4 and eltype Float64
=====================================================================
SMatrix:    m3 = m1 * m2 takes 87.07 ns, 144.00 bytes (GC 19.19 ns)
Array:      m3 = m1 * m2 takes 258.94 ns, 240.00 bytes (GC 32.41 ns)
MMatrix:    m3 = m1 * m2 takes 87.09 ns, 144.00 bytes (GC 19.11 ns)

Array:      A_mul_B!(m3, m1, m2) takes 153.44 ns, 0.00 bytes (GC 0.00 ns)
MMatrix:    A_mul_B!(m3, m1, m2) takes 20.16 ns, 0.00 bytes (GC 0.00 ns)

Now we see GC (it seems that in many cases it doesn't run, so the median doesn't detect it), but it really doesn't agree with my earlier benchmark, which showed more GC for MArray and none for SArray.
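The median/mean discrepancy above follows from how BenchmarkTools summarizes a trial: GC pauses hit only a minority of samples, so they vanish from the median but pull up the mean. A short sketch of inspecting both estimators on the same trial (variable names are illustrative):

```julia
using BenchmarkTools, Statistics

trial = @benchmark $(rand(4, 4)) * $(rand(4, 4))

med = median(trial)  # robust central estimate; rare GC pauses are invisible here
avg = mean(trial)    # sensitive to outliers, so occasional GC pauses show up

# gctime is reported per estimate; comparing the two exposes hidden GC cost.
println("median: ", time(med), " ns, gc ", gctime(med), " ns")
println("mean:   ", time(avg), " ns, gc ", gctime(avg), " ns")
```

This is why switching the summary statistic from median to mean made GC time reappear in the results above.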

@jrevels
Member

jrevels commented Nov 2, 2016

The BenchmarkTools results seem correct to me; it makes sense that the in-place matrix multiply isn't allocating, and that memory usage is small for small arrays.

Also, it seems that you're directly comparing aggregate measurements against BenchmarkTools' per-execution estimates, which are entirely different quantities (e.g. these numbers appear to compare an estimate of a single matmul with a direct measurement of a large number of matmuls).

I might be missing something though, since I don't have the actual code. If you believe this is unexpected behavior, can you share the actual code used to generate these numbers (preferably self-contained + whatever packages are required)?

@jrevels
Member

jrevels commented Nov 8, 2016

I still don't know about the OP here, but found a case where this issue is real (JuliaLang/julia#19257).

It comes from the nothing return block added in #4. If I remove that, it resolves the issue. I'll open a PR shortly.

@andyferris
Author

andyferris commented Nov 8, 2016

By resolves the issue, do you mean reports more allocations and longer run-times?

EDIT: Yes, I see that in the PR. Looks great.

@jrevels
Member

jrevels commented Nov 8, 2016

I don't know how #28 affects the issue in the OP here because I can't verify the issue in the OP (I don't have the code).

#28 fixes an issue where the function BenchmarkTools was constructing was "overly" amenable to code elision (either by LLVM or by Julia). Even after that PR, however, it's still simple to construct cases where the elision will occur (see that PR for an example). I think in such cases, BenchmarkTools is correct to report no allocation, since running such a function "in the wild" should have the same effect given a similar calling environment.

@andyferris
Author

So #28 doesn't seem to affect my results in any meaningful way.

I'm going to have a deeper look sometime, but I can't right now. The code is at https://github.com/JuliaArrays/StaticArrays.jl/blob/use-size/perf/benchmark3.jl, but that might not be too useful to you (I'll try to make a minimal example when I have more time).

@KristofferC
Contributor

KristofferC commented Nov 8, 2016

Just a note that #28 fixes a problem we had where some functions were being completely elided. I was just going to file an issue, and it's already fixed. What a luxury.

@gdalle gdalle added the bug label Jun 13, 2023
@gdalle
Collaborator

gdalle commented Sep 18, 2023

Is this bug still relevant?
