
Add memory barriers on lock/unlock of SpinLock #13050

Merged
straight-shoota merged 1 commit into crystal-lang:master from bcardiff:fix/spin-lock-aarch64 on Feb 8, 2023

Conversation

@bcardiff (Member) commented Feb 7, 2023

Fixes #13010. Credit goes to @ggiraldez.

The examples provided in #13010 (comment) and #13010 (comment) work on aarch64-apple-darwin without problems so far.

Something I was thinking is that, since we now emit explicit memory barriers, maybe we could use a more relaxed ordering for the #swap and #get calls in the SpinLock, instead of relying on Atomic, which enforces sequentially_consistent. But that can come later if needed; I am not sure it would lead to any performance improvement at all.
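
To make the idea concrete, here is a minimal sketch of that direction (not the stdlib implementation, and not part of this PR): the atomic exchange uses a relaxed ordering and correctness relies on the explicit fences. It assumes `Atomic#swap` and `Atomic#set` accept an ordering argument such as `:relaxed`, which may not be available in the Crystal version this PR targets.

```crystal
# Sketch only: relaxed RMW/store plus explicit fences instead of
# sequentially consistent atomics. The :relaxed ordering argument is an
# assumption about the Atomic API, not something this PR introduces.
class RelaxedSpinLock
  def initialize
    @m = Atomic(Int32).new(0)
  end

  def lock
    # Relaxed exchange: the ordering guarantee comes from the fence below.
    while @m.swap(1, :relaxed) == 1
      # busy-wait until the current holder releases the lock
    end
    # Acquire barrier: the critical section cannot be reordered before the lock.
    ::Atomic::Ops.fence :sequentially_consistent, false
  end

  def unlock
    # Release barrier: writes from the critical section become visible before the store below.
    ::Atomic::Ops.fence :sequentially_consistent, false
    @m.set 0, :relaxed
  end

  def sync
    lock
    begin
      yield
    ensure
      unlock
    end
  end
end
```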

@Blacksmoke16 added the kind:bug, topic:stdlib:concurrency, topic:multithreading, and platform:aarch64 labels on Feb 7, 2023
@straight-shoota added this to the 1.8.0 milestone on Feb 7, 2023
@bcardiff (Member, Author) commented Feb 7, 2023

Until a release includes this fix, it can be applied as a monkey-patch:

class Crystal::SpinLock
  def lock
    previous_def
    # barrier after acquiring the lock: the critical section cannot be reordered before it
    ::Atomic::Ops.fence :sequentially_consistent, false
  end

  def unlock
    # barrier before releasing the lock: prior writes become visible before the unlock
    ::Atomic::Ops.fence :sequentially_consistent, false
    previous_def
  end
end
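
As a quick sanity check, a hypothetical smoke test like the one below (program, names, and counts are made up for illustration) can exercise the patched lock: build it with `--release -Dpreview_mt`, run it with `CRYSTAL_WORKERS` set, and the final count should always match the expected total once the barriers are in place.

```crystal
# Hypothetical smoke test for the monkey-patched Crystal::SpinLock.
lock = Crystal::SpinLock.new
counter = 0
done = Channel(Nil).new
fibers = 8
per_fiber = 100_000

fibers.times do
  spawn do
    per_fiber.times do
      lock.lock
      counter += 1 # the lock must make this read-modify-write safe across threads
      lock.unlock
    end
    done.send nil
  end
end

fibers.times { done.receive }
puts counter # expected to always print 800000 with the barriers in place
```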

@straight-shoota merged commit 2d49b07 into crystal-lang:master on Feb 8, 2023
jgaskins added a commit to jgaskins/crystal that referenced this pull request on Jan 30, 2024:
This solution is the same as the one used in crystal-lang#13050.

The following code is expected to output `1000000` preceded by the time
it took to perform it:

```
mutex = Mutex.new
numbers = Array(Int32).new(initial_capacity: 1_000_000)
done = Channel(Nil).new
concurrency = 20
iterations = 1_000_000 // concurrency
concurrency.times do
  spawn do
    iterations.times { mutex.synchronize { numbers << 0 } }
  ensure
    done.send nil
  end
end

start = Time.monotonic
concurrency.times { done.receive }
print Time.monotonic - start
print ' '
sleep 100.milliseconds # Wait just a bit longer to be sure the discrepancy isn't due to a *different* race condition
pp numbers.size
```

Before this commit, on an Apple M1 CPU, the array size would be anywhere
from 880k-970k, but I never observed it reach 1M. Here is a sample:

```
$ repeat 20 (CRYSTAL_WORKERS=10 ./mutex_check)
00:00:00.119271625 881352
00:00:00.111249083 936709
00:00:00.102355208 946428
00:00:00.116415166 926724
00:00:00.127152583 899899
00:00:00.097160792 964577
00:00:00.120564958 930859
00:00:00.122803000 917583
00:00:00.093986834 954112
00:00:00.079212333 967772
00:00:00.093168208 953491
00:00:00.102553834 962147
00:00:00.091601625 967304
00:00:00.108157208 954855
00:00:00.080879666 944870
00:00:00.114638042 930429
00:00:00.093617083 956496
00:00:00.112108959 940205
00:00:00.092837875 944993
00:00:00.097882625 916220
```

This indicates that some fibers were acquiring the mutex while another fiber
still held it, so concurrent `numbers << 0` appends raced and some writes were
lost. With this commit, using the exact same parameters (built with
`--release -Dpreview_mt` and run with `CRYSTAL_WORKERS=10` to spread the work
across all 10 cores), these are the results I'm seeing:

```
00:00:00.078898166 1000000
00:00:00.072308084 1000000
00:00:00.047157000 1000000
00:00:00.088043834 1000000
00:00:00.060784625 1000000
00:00:00.067710250 1000000
00:00:00.081070750 1000000
00:00:00.065572208 1000000
00:00:00.065006958 1000000
00:00:00.061041541 1000000
00:00:00.059648291 1000000
00:00:00.078100125 1000000
00:00:00.050676250 1000000
00:00:00.049395875 1000000
00:00:00.069352334 1000000
00:00:00.063897833 1000000
00:00:00.067534333 1000000
00:00:00.070290833 1000000
00:00:00.067361500 1000000
00:00:00.078021833 1000000
```

Note that it's not only correct, but also significantly faster.

Labels

kind:bug, platform:aarch64, topic:multithreading, topic:stdlib:concurrency

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug in atomic operations on aarch64 with multi-threading

4 participants