Support racy initialization of an Executor's state #108

james7132 · 2024-04-01T20:00:10Z

Fixes #89. Uses @notgull's suggestion of using an AtomicPtr with racy initialization instead of a OnceCell.
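For readers unfamiliar with the pattern, here is a minimal sketch of racy initialization through an AtomicPtr. It is not the code from this PR: `LazyState`, `State`, and the `ticks` field are hypothetical names chosen for illustration, and the orderings shown are one conservative choice. Every caller that sees a null pointer allocates its own state and tries to publish it with a compare-exchange; the losers free their allocation and use the winner's.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the executor's shared state.
pub struct State {
    pub ticks: AtomicUsize,
}

impl State {
    fn new() -> Self {
        State { ticks: AtomicUsize::new(0) }
    }
}

/// Lazily allocated state; starts out as a null pointer.
pub struct LazyState {
    ptr: AtomicPtr<State>,
}

impl LazyState {
    pub const fn new() -> Self {
        LazyState { ptr: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Returns the shared state, allocating it on first use.
    pub fn get(&self) -> &State {
        let mut p = self.ptr.load(Ordering::Acquire);
        if p.is_null() {
            p = self.init();
        }
        // SAFETY: `p` is non-null and came from `Arc::into_raw`. The Arc is
        // only released in `Drop`, which requires exclusive access to `self`.
        unsafe { &*p }
    }

    /// Cold path: allocate a state and race to publish it.
    #[cold]
    fn init(&self) -> *mut State {
        let fresh = Arc::into_raw(Arc::new(State::new())) as *mut State;
        match self.ptr.compare_exchange(
            ptr::null_mut(),
            fresh,
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            // We won the race: our allocation is now the shared one.
            Ok(_) => fresh,
            // Another thread won: free ours and use theirs.
            Err(winner) => {
                // SAFETY: `fresh` came from `Arc::into_raw` above and was
                // never published, so this is the only handle to it.
                drop(unsafe { Arc::from_raw(fresh as *const State) });
                winner
            }
        }
    }
}

impl Drop for LazyState {
    fn drop(&mut self) {
        let p = *self.ptr.get_mut();
        if !p.is_null() {
            // SAFETY: `p` came from `Arc::into_raw` and `&mut self`
            // guarantees no other thread is using this `LazyState`.
            drop(unsafe { Arc::from_raw(p as *const State) });
        }
    }
}
```

Calling `get()` from several threads at once may allocate more than one `State`, but all callers end up with the same pointer and the extra allocations are freed immediately.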

Since this adds more unsafe code, I enabled the clippy::undocumented_unsafe_blocks lint at warn level and fixed a few of the remaining open clippy issues (e.g. Waker::clone_from already handles the case where the two wakers are equal).
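As an illustration of the Waker::clone_from point, here is a small sketch. `store_waker` and `Noop` are hypothetical helpers, not code from this crate. Instead of checking `will_wake` by hand before assigning a clone, the caller hands the new waker to `clone_from` and lets it decide whether a fresh clone is needed.

```rust
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Hypothetical waker that does nothing; only used to build `Waker` values.
pub struct Noop;

impl Wake for Noop {
    fn wake(self: Arc<Self>) {}
}

/// Stores `waker` in `slot`, reusing the existing waker where possible.
pub fn store_waker(slot: &mut Option<Waker>, waker: &Waker) {
    match slot {
        // Replaces the hand-written pattern
        // `if !existing.will_wake(waker) { *existing = waker.clone(); }`.
        Some(existing) => existing.clone_from(waker),
        None => *slot = Some(waker.clone()),
    }
}
```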

Removing async_lock as a dependency shouldn't be a SemVer breaking change.

james7132 · 2024-04-01T20:12:08Z

@kettle11 can you test this out in a wasm environment with atomics enabled? This should fix the issue you were running into.

notgull

Overall LGTM

src/lib.rs

notgull

Thanks!

notgull · 2024-04-08T21:17:52Z

Rebase this on master and then I will merge

james7132 · 2024-04-08T21:18:36Z

I'm not sure if we can relax the atomic orderings on the access or not. There are no strict ordering requirements, per se. The worst that happens if it's improperly ordered is that it allocates a few extra states that are immediately deallocated.

notgull · 2024-04-08T21:20:42Z

Yeah I think it's fine to only use Acquire ordering to load the pointer. Since there are no side effects to creating a new state allocation and it's cold code anyways, there shouldn't be any issues here.

james7132 · 2024-04-08T21:24:26Z

Could we just use Relaxed then? All that is needed is that the operation of assigning the pointer is atomic.

notgull · 2024-04-08T21:30:56Z

I know there is some weirdness around Relaxed in the C memory model, especially around ordering.

Let's merge this with Acquire ordering for now. Then we can open a new PR for Relaxed ordering, where we can benchmark if it's actually worth it.
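To make the ordering question concrete, here is a minimal sketch of publishing a heap allocation through an AtomicPtr. `Payload` and `publish_and_read` are hypothetical names, not code from this PR. The writer stores the pointer with Release and the reader loads it with Acquire; that pairing is what makes the initialized contents of the pointee visible to the reader before it dereferences the pointer. A Relaxed load would keep the pointer write itself atomic, but under the Rust memory model it would not by itself establish that happens-before relationship.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::thread;

/// Hypothetical payload published from one thread to another.
pub struct Payload {
    pub value: u64,
}

/// Publishes a `Payload` from one thread and reads it from another.
pub fn publish_and_read() -> u64 {
    let slot: AtomicPtr<Payload> = AtomicPtr::new(ptr::null_mut());

    let seen = thread::scope(|s| {
        // Writer: initialize the payload, then publish the pointer.
        s.spawn(|| {
            let fresh = Box::into_raw(Box::new(Payload { value: 42 }));
            slot.store(fresh, Ordering::Release);
        });

        // Reader: spin until the pointer is published, then read through it.
        let reader = s.spawn(|| loop {
            let p = slot.load(Ordering::Acquire);
            if !p.is_null() {
                // SAFETY: `p` was published with Release after the payload
                // was fully initialized, and it is not freed until both
                // threads have been joined.
                break unsafe { (*p).value };
            }
            std::hint::spin_loop();
        });

        reader.join().unwrap()
    });

    // Both threads have finished; reclaim the allocation.
    let p = slot.into_inner();
    // SAFETY: `p` is non-null because the writer always stores a pointer
    // from `Box::into_raw`, and nothing else references it any more.
    drop(unsafe { Box::from_raw(p) });
    seen
}
```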

james7132 · 2024-04-08T23:45:57Z

Quick check against the benchmarks seems to show a (mostly) positive improvement over the use of async_lock::OnceCell:

executor::create        time:   [1.1075 µs 1.1082 µs 1.1089 µs]
                        change: [-7.9526% -7.8321% -7.7141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

single_thread/executor::spawn_one
                        time:   [1.6380 µs 1.6489 µs 1.6654 µs]
                        change: [-1.2081% +0.7322% +3.0809%] (p = 0.51 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  4 (4.00%) high severe
single_thread/executor::spawn_batch
                        time:   [29.932 µs 31.962 µs 35.088 µs]
                        change: [+5.5198% +15.526% +27.277%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
single_thread/executor::spawn_many_local
                        time:   [5.0103 ms 5.0233 ms 5.0366 ms]
                        change: [-7.9087% -6.4987% -5.2481%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
single_thread/executor::spawn_recursively
                        time:   [30.415 ms 30.888 ms 31.403 ms]
                        change: [-38.044% -36.803% -35.536%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe
single_thread/executor::yield_now
                        time:   [5.2423 ms 5.2992 ms 5.3750 ms]
                        change: [-1.9443% -0.7857% +0.6821%] (p = 0.27 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe

multi_thread/executor::spawn_one
                        time:   [1.4755 µs 1.5024 µs 1.5404 µs]
                        change: [-11.989% -6.9781% -2.6378%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe
multi_thread/executor::spawn_batch
                        time:   [47.541 µs 54.625 µs 66.057 µs]
                        change: [-16.793% -9.5843% -0.0431%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
multi_thread/executor::spawn_many_local
                        time:   [24.540 ms 24.663 ms 24.792 ms]
                        change: [-9.3981% -6.1580% -3.5130%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
Benchmarking multi_thread/executor::spawn_recursively: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 17.3s, or reduce sample count to 20.
multi_thread/executor::spawn_recursively
                        time:   [171.80 ms 172.19 ms 172.63 ms]
                        change: [-29.251% -25.846% -22.158%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
multi_thread/executor::yield_now
                        time:   [23.836 ms 24.130 ms 24.370 ms]
                        change: [+25.755% +29.244% +32.596%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) low severe
  3 (3.00%) low mild

It may be even faster if we could use a Box instead of an Arc to avoid the cost of atomic increments/decrements, and just be sure to deallocate all of the tasks before the state when dropping the executor, but we can save that for another PR.

notgull · 2024-04-09T00:06:59Z

That makes sense; we're basically doing what OnceCell is doing but doing more operations at once. Once you rebase on master I can merge.

james7132 · 2024-04-09T00:27:51Z

Should be ready to go.

notgull · 2024-04-09T01:35:40Z

I can't merge, and I can't force-push to your branch. Please run the following, where canonical is this repository's remote:

$ git fetch canonical master
$ git rebase canonical/master
$ git push origin racy-initialization --force

james7132 mentioned this pull request Apr 2, 2024

WebAssembly multithreading tracking issue bevyengine/bevy#4078

Open

notgull requested changes Apr 5, 2024

james7132 requested a review from notgull April 8, 2024 21:00

notgull approved these changes Apr 8, 2024

james7132 added 6 commits April 8, 2024 19:34

Support racy initialization of an Executor's state

30c6614

Fix MSRV builds

e184198

No NULL_PTR constant

1e37ec3

Semicolon

9f8a9a7

Fix potential use-after-free

282398d

Use get_mut

14fd269

james7132 force-pushed the racy-initialization branch from 83c4c3b to 14fd269 Compare April 9, 2024 02:34

notgull merged commit 649bdfd into smol-rs:master Apr 9, 2024
8 checks passed

notgull mentioned this pull request Apr 14, 2024

v1.11.0 #114

Merged
