
profile collects fewer samples than in 0.4 #14281

Closed
JeffBezanson opened this issue Dec 5, 2015 · 4 comments
Labels
regression Regression in behavior compared to a previous version

Comments

@JeffBezanson
Member

I was profiling #14106, and saw this. Set the delay to 0.002.

On master (run takes 1.2 seconds):

julia> Profile.print()
164 REPL.jl; anonymous; line: 92
 164 REPL.jl; eval_user_input; line: 3
  164 ./boot.jl; eval; line: 263
   164 profile.jl; anonymous; line: 16
    164 array.jl; serialize; line: 228  # also this filename and line num are probably wrong
     164 serialize.jl; serialize; line: 185

On 0.4 (run takes 0.8 seconds):

julia> Profile.print()
411 REPL.jl; anonymous; line: 92
 411 REPL.jl; eval_user_input; line: 62
  411 profile.jl; anonymous; line: 16
   411 serialize.jl; serialize; line: 430
    7   serialize.jl; serialize; line: 184
    402 serialize.jl; serialize; line: 185

The 0.4 profiler got 2.5x more samples even though the test case takes less time.

@vtjnash
Member

vtjnash commented Dec 5, 2015

Can you provide a test case? My attempted test case gets faster when I profile it (tested on Mach and Linux):

julia> x=1
1

julia> f() = (global x; for i = 1:10^8; x+=i; end)
f (generic function with 1 method)

julia> @time for i = 1:10^8; x+=i; end
  5.644128 seconds (200.00 M allocations: 2.980 GB, 5.68% gc time)

julia> @time f()
  6.130925 seconds (200.00 M allocations: 2.980 GB, 4.97% gc time)

julia> @time f()
  6.129639 seconds (200.00 M allocations: 2.980 GB, 4.99% gc time)

julia> @time @profile f()
  5.777516 seconds (200.00 M allocations: 2.980 GB, 5.71% gc time)

julia> @time @profile f()
  5.787910 seconds (200.00 M allocations: 2.980 GB, 5.62% gc time)

@JeffBezanson
Member Author

This script produces the exact output I see on Linux:

type Particle
    x::Float64
    y::Float64
    z::Float64
    vx::Float64
    vy::Float64
    vz::Float64
end

function build_particles(nParticles)
    particles = Array(Particle, nParticles)
    for i=1:nParticles
        x = 0.0
        y = 0.0
        z = 0.0
        vx = rand()
        vy = rand()
        vz = rand()

        particles[i] = Particle(x, y, z, vx, vy, vz)
    end
    return particles
end

nParticles = 300000
particles = build_particles(nParticles)

b = IOBuffer()
serialize(b, particles)
b = IOBuffer()
Profile.init(delay=.002)
@profile serialize(b, particles)
Profile.print()

@vtjnash
Member

vtjnash commented Dec 6, 2015

Are you certain your environment (LLVM version) is the same? On my test machine, this code measures as running 10% faster when profiled (due to an altered GC profile?) -- same for 0.4 and master.

OK, I'm seeing this now; my build was older than I thought. The problem is probably that the Linux/POSIX kernel is really bad at multithreaded profiling (no regression occurred on the Mach kernel).

@JeffBezanson JeffBezanson added the regression Regression in behavior compared to a previous version label Dec 8, 2015
@vtjnash
Member

vtjnash commented Dec 14, 2015

I think we may be able to hack around Linux's deficiency here by creating N kernel timers (timer_create) configured with SIGEV_SIGNAL | SIGEV_THREAD_ID and syscall(SYS_gettid). FWIW, it looks like this subject has been the cause of a long-running (and active) debate in glibc/pthreads: https://sourceware.org/bugzilla/show_bug.cgi?id=6399
