More instrumentation (memory allocation, also fixes #7259) #7464

timholy · 2014-06-30T11:44:16Z

This adds the ability to track the amount of memory allocated by each line of code. It mimics the --code-coverage functionality, writing a filename.jl.mlc (mlc = malloc, perhaps there's a better extension?) listing the number of bytes allocated by each line.

It's already quite useful, but not perfect, and I confess to not really understanding what's wrong. Here's a test script:

function alloc1()
    s = 0
    for x = 1:3
        s += x
    end
    A = zeros(4000, 7000)
    s2 = sum(A)
    nothing
end

function alloc2()
    s = 0
    for x = 1:3
        s += x/2  # deliberately type-unstable
    end
    s
end

function runner()
    alloc1()
    alloc2()
end

runner()
println("Something")
println("Clearing allocation data to eliminate memory allocated during JITting")
clear_malloc_data()
runner()

which you run like this:

julia --track-allocation=user logallocation.jl

Here's the resulting .mlc file: (EDIT: the off-by-one bug has been fixed)

        - function alloc1()
        0     s = 0
        0     for x = 1:3
        0         s += x
        -     end
        0     A = zeros(4000, 7000)
224000048     s2 = sum(A)
        0     nothing
        - end
        - 
        - function alloc2()
        0     s = 0
        0     for x = 1:3
       64         s += x/2  # deliberately type-unstable
        -     end
       32     s
        - end
        - 
        - function runner()
      192     alloc1()
        0     alloc2()
        - end
        - 
        - runner()
        - println("Something")
        - println("Clearing allocation data to eliminate memory allocated during JITting")
        - clear_malloc_data()
        - runner()
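
As a rough sanity check on the large number in the report, the payload of a 4000×7000 `Float64` matrix is 4000 × 7000 × 8 = 224,000,000 bytes, so 224000048 is that payload plus 48 bytes (presumably array bookkeeping, which is an assumption here). The sketch below uses the standard `@allocated` macro on a smaller matrix; `payload_bytes` and `measure` are hypothetical helper names, not part of this PR.

```julia
# Expected payload of a Float64 matrix: rows * cols * 8 bytes.
payload_bytes(rows, cols) = rows * cols * sizeof(Float64)

# Hypothetical helper: bytes allocated while building a rows-by-cols zero matrix.
function measure(rows, cols)
    # @allocated reports the bytes allocated while evaluating the expression.
    return @allocated zeros(rows, cols)
end

measure(4, 7)                   # warm-up call so compilation is not measured
measured = measure(400, 700)    # smaller than the 4000x7000 matrix in the script

println("payload for 4000x7000: ", payload_bytes(4000, 7000), " bytes")
println("measured for 400x700:  ", measured, " bytes")
```

The measured value should be at least the payload; how much overhead is added on top depends on the Julia version.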

The core algorithm is to look for a change in jl_gc_total_bytes since the last line.
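
A minimal sketch of that idea, simulated in plain Julia rather than in the compiler: a monotonically increasing byte counter is read after each executed line, and the growth since the previous reading is credited to that line. The function name `per_line_deltas` and the counter readings are made up for illustration; the actual PR inserts these checks into generated code.

```julia
# Hypothetical illustration: credit the growth of a byte counter to the line
# that just finished executing.
function per_line_deltas(readings::Vector{Tuple{Int,Int}}, start::Int)
    totals = Dict{Int,Int}()    # line number => bytes attributed to that line
    prev = start                # counter value at the previous check
    for (line, counter) in readings
        totals[line] = get(totals, line, 0) + (counter - prev)
        prev = counter
    end
    return totals
end

# (line number, counter value read after that line ran); made-up numbers.
readings = [(10, 1000), (12, 1000), (12, 1016), (14, 5016)]
deltas = per_line_deltas(readings, 1000)

for line in sort(collect(keys(deltas)))
    println(line, " => ", deltas[line], " bytes")
end
```

In this scheme a check placed one statement too early or too late shifts every delta to a neighbouring line, which is the kind of off-by-one symptom described in the observations.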

Observations:

There seems to be an off-by-one bug even though I attempted to insert the memory-check instructions after each line had already been built.
Anything run from the global scope (for example, a test script in a package's tests/) shows allocation on the first line of the function.

coveralls · 2014-07-01T14:04:41Z

Changes Unknown when pulling 15c9ef3 on teh/malloclog into master.

IainNZ · 2014-07-01T16:11:35Z

Wait --code-coverage works for Base?
Also, this is a really cool idea

coveralls · 2014-07-01T16:27:23Z

Changes Unknown when pulling 09adfd5 on teh/malloclog into master.

jakebolewski · 2014-07-01T16:47:35Z

This is cool but could you turn off the coveralls reporting on every pull request? It seems like it is going to generate a lot of noise.

timholy · 2014-07-01T17:16:02Z

> Wait --code-coverage works for Base?

In this PR it does 😄. But there are still issues I'm working out.

> This is cool but could you turn off the coveralls reporting on every pull request? It seems like it is going to generate a lot of noise.

I think coverage-testing is going to have to be a separate event. I'm pretty sure you're going to have to test with a single core (at least, until someone implements a way to transport coverage data from one process to another), and I kinda suspect we may even need to disable inlining. If so, it's going to be so freaking slow that we don't want to subject every PR to it.

So, my modifications to .travis.yml are just for my testing (that commit is titled "Dirty hack" for precisely that reason), and won't be part of the final PR. There are still some oddities I'm working out. I apologize in advance for any extra noise.

timholy · 2014-07-02T11:05:04Z

I'm finding myself in a nasty situation where all tests pass on my laptop, they pass on a second machine in my lab (with a different distro and architecture), but I get this segfault when running on Travis. Frustratingly, this is the second time this has happened to me. I'm taking the mildly-heroic step of running the entire test suite using a single core under valgrind. That run will probably take something like 24 hours; in the meantime, does anyone have any ideas? This one is not some kind of file permission error, is it?

carnaval · 2014-07-02T11:07:30Z

In my experience, a polite email to support@travis will get you ssh credentials to an identical Travis VM for 24h; this could help you debug if the segfault is decently reproducible. Nice piece of work btw.

timholy · 2014-07-02T12:15:05Z

Ooh, that's a great tip, many thanks. Since I'm about to head out of town for the 4th and will have only spotty internet access, I'll wait to use those precious 24hrs until I get back.

timholy · 2014-07-08T21:38:22Z

Got a Makefile/installation problem. Looking for advice from folks like @staticfloat, @tkelman, @ViralBShah, or anyone else who knows more about these matters than me, esp. across platforms.

Bottom line: for --code-coverage=all to work, it needs to know where to find base/. What would be the best way to communicate to runtime julia where base/ is? The options I see are built on the observation that JULIA_HOME is known at runtime:

1. In JULIA_HOME put a symlink to base/.
2. Write a file called $JULIA_HOME/path_to_base that contains a string describing the full path to base/.
3. Compile the value into the julia executable. (This option seems bad because what if someone moves things around after compilation?)

Any advice about which to choose? Or alternative suggestions? Presumably this should be implemented in the Makefile?

(Note: the location of base/, relative to the julia binary, can vary depending on whether we do a make install step. On Travis the binary gets installed in /tmp/julia/bin and the code for base is in /tmp/julia/share/julia/base. That's different from the case where you build & run from the cloned directory, where the binary is in $CLONED_DIR/usr/bin and base is in $CLONED_DIR/base.)

timholy · 2014-07-08T21:48:26Z

(Oh, and thanks to @carnaval for the suggestion to request temporary Travis access. Very helpful here and in #6877.)

tkelman · 2014-07-08T22:19:38Z

It looks like the makefiles are setting up a set of symlinks (or NTFS junctions for anyone who builds from source on Windows) well before make install, so $CLONED_DIR/usr/share/julia/base is a symlink to $CLONED_DIR/base. I think it would be safe to assume base is always at $JULIA_HOME/../share/julia/base.
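
A sketch of that lookup, assuming POSIX-style paths. `guess_base_dir` is a hypothetical helper, not an existing Julia function; it only encodes the `../share/julia/base` convention described above.

```julia
# Hypothetical helper (not part of Julia): derive the location of base/ from
# the directory that holds the julia binary (JULIA_HOME in this thread;
# current Julia versions expose that directory as Sys.BINDIR).
function guess_base_dir(bindir::AbstractString)
    return normpath(joinpath(bindir, "..", "share", "julia", "base"))
end

# The two layouts mentioned in the thread (the clone directory is a placeholder).
installed = guess_base_dir("/tmp/julia/bin")
in_tree   = guess_base_dir("/home/user/julia/usr/bin")

println(installed)
println(in_tree)
```

For the in-tree case the result is the symlinked `usr/share/julia/base`, so the same rule covers both a `make install` layout and a build run from the cloned directory.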

ViralBShah · 2014-07-08T23:35:19Z

Given that we do have code in our runtime to locate where the .jl files in base are, we probably can use the same directory lookup logic to find the location of base.

timholy · 2014-07-09T01:21:47Z

Ok, that's very helpful.

We have our first test coverage number: 81%. But that number is probably a significant overestimate, see #7541.

timholy · 2014-07-09T08:50:35Z

Hmm, Coveralls prefers to display master, for which we got what was probably a partial result earlier. The actual number seems to be 91%.

timholy · 2014-07-09T12:56:23Z

Working, cleaned up, and ready to merge when the time is right.

StefanKarpinski · 2014-07-09T17:56:00Z

That's amazingly high. Like unbelievably high – even given that this is the percentage of coverage of functions that we actually call while testing. I think that a good way to break this down might be to combine this number with the percentage of functions that we actually test.

StefanKarpinski · 2014-07-09T18:00:16Z

This seems pretty low risk and like it should be merged as soon as is reasonable.

timholy · 2014-07-09T18:28:33Z

> This seems pretty low risk and like it should be merged as soon as is reasonable.

Anything that touches the compiler seems like it shouldn't be "low risk," but here I know what you mean.

> That's amazingly high. Like unbelievably high

Indeed. I think the main value is for a human looking over the results manually; one can quickly find a bunch of functions that should probably get tests.

I think soon we should implement turning off inlining, but I'm inclined to merge as-is so we can get at least one night's run by PackageEval before fiddling with this more.

Besides, I think the most exciting part of this is the part on memory allocation, and that's not really influenced by all the difficult issues surrounding code-coverage.

IainNZ · 2014-07-10T02:40:59Z

So @timholy we/I should probably add support for this to Coverage.jl, I'm thinking maybe in a way that sorts lines by memory allocation?
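
One way the sorting could look, as a sketch only: it parses report text in the layout shown earlier in this thread (a byte count or `-`, followed by the source line) and ranks lines by bytes allocated. `rank_allocations` is a hypothetical name and is not Coverage.jl's API.

```julia
# Hypothetical helper: parse allocation-report text ("<bytes or -> <source>")
# and return (bytes, line number, source text) tuples, largest allocation first.
function rank_allocations(report::AbstractString)
    entries = Tuple{Int,Int,String}[]
    for (lineno, raw) in enumerate(split(report, '\n'))
        parts = split(lstrip(raw), ' '; limit=2)
        bytes = tryparse(Int, parts[1])
        bytes === nothing && continue    # "-" marks lines with no allocation data
        src = length(parts) == 2 ? String(strip(parts[2])) : ""
        push!(entries, (bytes, lineno, src))
    end
    return sort(entries; by = first, rev = true)
end

# A fragment of the report from the top of this thread.
report = join([
    "        - function alloc2()",
    "        0     s = 0",
    "       64         s += x/2",
    "       32     s",
    "        - end",
], '\n')

ranked = rank_allocations(report)
for (bytes, lineno, src) in ranked
    println(lpad(bytes, 10), "  line ", lineno, ": ", src)
end
```

Keeping the line number alongside the byte count lets the sorted output still point back into the source file.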

timholy · 2014-07-10T09:32:12Z

@IainNZ, I'll submit a PR.

One thing that never happened was bikeshedding on the name of the output files. Are people happy with the *.mlc extension? Would *.mem be better? Certainly not too late to change at this point.

StefanKarpinski · 2014-07-10T17:16:39Z

I like .mem.

timholy · 2014-07-12T09:52:41Z

Changed to .mem.

timholy mentioned this pull request Jun 30, 2014

WIP: first batch of performance improvements for Gadfly GiovineItalia/Gadfly.jl#346

Merged

timholy mentioned this pull request Jul 2, 2014

Over-reporting of coverage JuliaCI/Coverage.jl#11

Closed

More instrumentation (memory allocation, also fixes #7259)

196dfbc

timholy changed the title from "WIP: More instrumentation (memory allocation, also fixes #7259)" to "More instrumentation (memory allocation, also fixes #7259)" Jul 9, 2014

timholy mentioned this pull request Jul 9, 2014

RFT (request for tips): tweaking .cov files, aka code_coverage output #7541

Closed

timholy added a commit that referenced this pull request Jul 9, 2014

Merge pull request #7464 from JuliaLang/teh/malloclog

a60de70

More instrumentation (memory allocation, also fixes #7259)

timholy merged commit a60de70 into master Jul 9, 2014

timholy deleted the teh/malloclog branch July 24, 2014 00:06

timholy mentioned this pull request Jul 28, 2014

Profiling memory usage #4442

Closed

GunnarFarneback mentioned this pull request Aug 19, 2014

gc weirdness #8055

Closed

timholy mentioned this pull request Oct 23, 2014

WIP: Code coverage reports for base #8781

Closed

vtjnash mentioned this pull request Jun 18, 2015

Fix exported names from gc.c to have jl_gc_ prefix #11741

Merged
