Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

is_threading is racy #232

Closed
dnadlinger opened this issue Mar 21, 2022 · 14 comments
Closed

is_threading is racy #232

dnadlinger opened this issue Mar 21, 2022 · 14 comments

Comments

@dnadlinger
Copy link

In is_threading, p.threads_used, a Vector of thread ids, is accessed and updated:

function is_threading(p::AbstractProgress)
Threads.nthreads() == 1 && return false
length(p.threads_used) > 1 && return true
if !in(Threads.threadid(), p.threads_used)
push!(p.threads_used, Threads.threadid())
end
return length(p.threads_used) > 1
end

I don't think Vector is thread-safe, though – wouldn't accesses threads_used also need some form of locking?

@IanButterworth
Copy link
Collaborator

ah, true. It would be good to avoid a lock though, as the reason for this check is to avoid the need for a lock, given that locking got slower in recent julia versions

@dnadlinger
Copy link
Author

dnadlinger commented Mar 26, 2022

Which use case are you most concerned about, performance-wise (that isn't covered by the nthreads() fast path)? I don't see an obvious way to avoid locking altogether, other than avoiding the lock by using a lock-free counter to track work altogether.

Right now, I am personally only really interested in @threads. For this, something like

function is_threading()
  Threads.nthreads() == 1 && return false 

  if Threads.threadid() > 1
    # Called from another thread; so certainly in a @threads region.
    return true
  end

  # On the main thread, so just check if we are in a @threads region.
  return ccall(:jl_in_threaded_region, Cint, ()) != 0
end

might do the trick. This wouldn't work for @spawn/…, though.

In my threaded programs, the progress meter instance is either going to be across threads anyway, or the situation will be such that the overhead from an uncontested lock is insignificant compared to the actual computation, though, so I'm not sure this problem needs to be solved. If it does, perhaps there could be an explicit thread safety flag/choice of variants (say, Progress(…, thread_safe=false)) for when the overhead matters to users?

@dnadlinger dnadlinger changed the title is_threading is racy (?) is_threading is racy Mar 26, 2022
@IanButterworth
Copy link
Collaborator

It's quite expensive to lock each iteration on tight loops. If a user has nthreads() > 1 and uses ProgressMeter on a non-threaded loop, I just felt the lock was worth avoiding.

It may be over-optimization though. Redesign proposals would definitely be welcomed by me!

@dnadlinger
Copy link
Author

I've honestly not used ProgressMeter.jl enough yet to be able to judge what the correct design tradeoffs are. One option to lower the overhead would be to use lock-free atomics instead, but then printing the updates could presumably only be done from the main thread to make sure the updates are ordered correctly. Using Threads.SpinLock might also be cheaper.

@danielwe
Copy link

danielwe commented May 12, 2023

Workaround for anyone else who might be concerned about this:

p = Progress(n)
append!(p.threads_used, 1:Threads.nthreads())
# code using p

As for a proper solution, I think you can get away with only locking in the exceptional case, as follows:

function is_threading(p::AbstractProgress)
    Threads.nthreads() == 1 && return false
    length(p.threads_used) > 1 && return true
    return lock(lk) do
        if !in(Threads.threadid(), p.threads_used)
            push!(p.threads_used, Threads.threadid())
        end
        return length(p.threads_used) > 1
    end 
 end 

That should only occur at most twice during execution.

EDIT: scratch, that, it's not exceptional if Threads.nthreads() > 1 but you're only accessing p from a single thread.

@IanButterworth
Copy link
Collaborator

Yeah on a nthreads() > 1 session a simple hot loop becomes 25x slower with that lock, whereas it's currently at most 9x slower.

I don't yet have a decent design to fix this. Further suggestions appreciated!

@IanButterworth
Copy link
Collaborator

Suggestion by @vchuravy

Update the counter atomically and then try to acquire the lock and then you carry on if you didn't get it
Sink the log into update progress so that you only acquire it when really needed
And use a trylock. Worst case you miss an update

Also from @vtjnash

The fastest is usually a two-split design that stores the initial threadid or Task in one counter, and everything else that isn’t those uses a lock and a different counter

@MarcMush
Copy link
Collaborator

It happened in CI

Stacktrace
ProgressThreads tests: Error During Test at /home/runner/work/ProgressMeter.jl/ProgressMeter.jl/test/test_threads.jl:2
  Got exception outside of a @test
  TaskFailedException
  
      nested task error: BoundsError: attempt to access MemoryRef{Int64} at index [1]
      Stacktrace:
        [1] getindex
          @ ./essentials.jl:882 [inlined]
        [2] iterate (repeats 2 times)
          @ ./array.jl:887 [inlined]
        [3] in
          @ ./operators.jl:1304 [inlined]
        [4] is_threading(p::ProgressThresh{Float64})
          @ ProgressMeter ~/work/ProgressMeter.jl/ProgressMeter.jl/src/ProgressMeter.jl:477
        [5] lock_if_threading
          @ ~/work/ProgressMeter.jl/ProgressMeter.jl/src/ProgressMeter.jl:484 [inlined]
        [6] #update!#18
          @ ~/work/ProgressMeter.jl/ProgressMeter.jl/src/ProgressMeter.jl:532 [inlined]
        [7] update!(p::ProgressThresh{Float64}, val::Float64)
          @ ProgressMeter ~/work/ProgressMeter.jl/ProgressMeter.jl/src/ProgressMeter.jl:531
        [8] macro expansion
          @ ~/work/ProgressMeter.jl/ProgressMeter.jl/test/test_threads.jl:58 [inlined]
        [9] (::var"#618#threadsfor_fun#155"{var"#618#threadsfor_fun#147#156"{Float64, ReentrantLock, UnitRange{Int64}}})(tid::Int64; onethread::Bool)
          @ Main ./threadingconstructs.jl:214
       [10] #618#threadsfor_fun
          @ ./threadingconstructs.jl:181 [inlined]
       [11] (::Base.Threads.var"#1#2"{var"#618#threadsfor_fun#155"{var"#618#threadsfor_fun#147#156"{Float64, ReentrantLock, UnitRange{Int64}}}, Int64})()
          @ Base.Threads ./threadingconstructs.jl:153

https://github.com/MarcMush/ProgressMeter.jl/actions/runs/7719366745/job/21042475069

@vtjnash
Copy link

vtjnash commented Jan 31, 2024

The push! there looks very dangerous when used by multiple threads

@lmiq
Copy link
Contributor

lmiq commented Jul 3, 2024

Just to mention that this issue is now breaking packages (one mine at least) in Julia 1.11.

Thus, a safe fix for this is necessary asap.

@lmiq
Copy link
Contributor

lmiq commented Jul 3, 2024

One alternative could be add a "no_lock" keyword argument, as an opt in, with the appropriate explanation.

@IanButterworth
Copy link
Collaborator

Sounds like a good quickfix

@lmiq
Copy link
Contributor

lmiq commented Jul 4, 2024

I think this update is a reasonable one, which solves the issue with very minor modifications: #321

Relative to what is proposed in #232 (comment), what I do there is just define a new lock and just use the lock if the threads_used needs updating, which will not occur most than twice. I don't see any performance difference in the tight loop of the prog_perf example of the tests.

@MarcMush
Copy link
Collaborator

fixed by #322

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants