Skip to content

Multithreading support for lazy-jit#1166

Merged
bjorn3 merged 4 commits intorust-lang:masterfrom
eggyal:lazy-jit-multithreaded
Jun 25, 2021
Merged

Multithreading support for lazy-jit#1166
bjorn3 merged 4 commits intorust-lang:masterfrom
eggyal:lazy-jit-multithreaded

Conversation

@eggyal
Copy link
Contributor

@eggyal eggyal commented Apr 26, 2021

Fixes #1124, in which @bjorn3 commented:

It should be possible to use a channel between the rustc thread and the thread requesting the compilation and then wait for the function to be compiled.

The problem I had using a channel was that I couldn't get the Sender to the child threads other than through using a global static (in some way), which requires synchronisation... and then the channel just added additional synchronisation on top—which by that point was superfluous. This PR uses a Mutex<VecDeque> for message passing instead.

@eggyal eggyal force-pushed the lazy-jit-multithreaded branch from 80da4a5 to 1b9dcbb Compare April 26, 2021 20:35
@bjorn3
Copy link
Member

bjorn3 commented Apr 27, 2021

Thanks! I will take a look tomorrow.

@eggyal
Copy link
Contributor Author

eggyal commented Apr 27, 2021

Woke this morning with a realisation that it's probably better to use a channel after all—still requires synchronisation to get a clone, but that clone can then be stored thread-local which will then use far less locking.

@eggyal eggyal force-pushed the lazy-jit-multithreaded branch 9 times, most recently from 4bc1872 to 00ccca8 Compare April 27, 2021 15:27
@eggyal eggyal marked this pull request as draft April 27, 2021 15:33
@eggyal
Copy link
Contributor Author

eggyal commented Apr 27, 2021

Ran some tests with more threads and started hitting BAD_ACCESS errors... placing this in draft while I investigate.

@eggyal eggyal force-pushed the lazy-jit-multithreaded branch 2 times, most recently from 96425ef to 23c6b56 Compare April 27, 2021 18:00
@bjorn3
Copy link
Member

bjorn3 commented Apr 28, 2021

I think bytecodealliance/wasmtime#2786 is required for this to work. I need to implement bytecodealliance/wasmtime#2786 (comment) though before it can be merged.

@eggyal
Copy link
Contributor Author

eggyal commented Apr 28, 2021

I removed my previous comments as I'm now encountering different variations... but yes, I'm sure you're right that that's the cause here.

@eggyal
Copy link
Contributor Author

eggyal commented Apr 28, 2021

I extended your branch to implement deferred GOT updates in https://github.com/eggyal/wasmtime/tree/atomic_jit — and that did indeed fix the problem here... but unless Cranelift is to itself run multithreaded, there's no chance of the GOT being updated from multiple threads so I don't think there's any need for AtomicPtr? Certainly other threads are reading from the GOT and you don't want a dirty read, but those aren't atomic loads (unless it's your intention to introduce them?) and don't most (if not all?) architectures guarantee that reading an aligned usize will never be dirty anyway?

Or is the issue about ensuring atomic updates of multiple GOT entries, such as data objects and related functions, so that they're never witnessed in an inconsistent state? Not sure AtomicPtr helps with that either. Maybe the whole GOT needs to be behind a RwLock...

@eggyal eggyal marked this pull request as ready for review April 28, 2021 16:27
@bjorn3
Copy link
Member

bjorn3 commented Apr 28, 2021

Data races are UB in Rust and LLVM. LLVM is for example allowed to write arbitrary values (say 42) before writing the final value. This would crash the jitted code. Data races are not UB in Cranelift. For this reason the rust code needs to use atomic stores to prevent UB, while the Cranelift side doesn't need to use atomic loads to prevent UB. At least on all architectures currently supported by Cranelift.

By the way I think you will need to ensure that two simultaneous requests to compile the same function will result in only a single compilation.

@bjorn3
Copy link
Member

bjorn3 commented Apr 28, 2021

Instead of storing a GotEntryKey in GotEntry could you store an *const AtomicPtr pointing directly to the got entry instead? Also do you want me to incorporate the changes into my PR or do you want to open a replacement PR for Cranelift?

@eggyal
Copy link
Contributor Author

eggyal commented Apr 28, 2021

Data races are UB in Rust and LLVM. LLVM is for example allowed to write arbitrary values (say 42) before writing the final value. This would crash the jitted code. Data races are not UB in Cranelift. For this reason the rust code needs to use atomic stores to prevent UB, while the Cranelift side doesn't need to use atomic loads to prevent UB. At least on all architectures currently supported by Cranelift.

Aha, got it. Thanks.

By the way I think you will need to ensure that two simultaneous requests to compile the same function will result in only a single compilation.

D'oh! Well caught. Considered holding some list of pending *const Instances in a static, but since that would require locking on every jit request it brought me full circle back to the original idea of just handling messages via a static Mutex<VecDeque> and saving on the additional synchronisation provided by a channel. Will push an update shortly.

Instead of storing a GotEntryKey in GotEntry could you store an *const AtomicPtr pointing directly to the got entry instead? Also do you want me to incorporate the changes into my PR or do you want to open a replacement PR for Cranelift?

I've used NonNull<AtomicPtr<u8>>, which is quite neat as it just copies directly through the Options, and of course enforces the additional semantics. Have pushed that one to my branch already. Don't mind however you want to do it... can open a PR to your branch in your repo if that's any help?

@bjorn3
Copy link
Member

bjorn3 commented Apr 28, 2021

I force pushed it directly to my PR branch.

@eggyal eggyal marked this pull request as draft April 28, 2021 23:01
@eggyal eggyal force-pushed the lazy-jit-multithreaded branch 2 times, most recently from 256a892 to c87ea65 Compare April 29, 2021 05:44
@eggyal
Copy link
Contributor Author

eggyal commented Apr 29, 2021

Requires exposure of JITModule::get_got_address in Cranelift, for which I'll submit a PR.

@bjorn3
Copy link
Member

bjorn3 commented Jun 10, 2021

bytecodealliance/wasmtime#2786 has been merged. @eggyal can you please rebase this PR?

@eggyal eggyal force-pushed the lazy-jit-multithreaded branch from ad8b191 to 5c78324 Compare June 17, 2021 08:53
@eggyal eggyal marked this pull request as ready for review June 17, 2021 08:54
@eggyal
Copy link
Contributor Author

eggyal commented Jun 25, 2021

@bjorn3 Not sure whether you saw that I rebased this as requested, but pinging you just in case.

@bjorn3
Copy link
Member

bjorn3 commented Jun 25, 2021

Completely missed it. Thanks for the ping.

@bjorn3
Copy link
Member

bjorn3 commented Jun 25, 2021

Pushed a commit to test this feature.

@bjorn3
Copy link
Member

bjorn3 commented Jun 25, 2021

CI seems to be having some issues with installing wine for windows testing. Merging as I don't expect any failures from this CI job.

@bjorn3 bjorn3 merged commit 0d1cede into rust-lang:master Jun 25, 2021
@eggyal eggyal deleted the lazy-jit-multithreaded branch June 25, 2021 17:40
JohnTitor referenced this pull request in JohnTitor/rust Jul 7, 2021
…jorn3

Sync rustc_codegen_cranelift

The main hightlight this sync is basic support for AArch64. Most things should work on Linux, but there does seem to be an ABI incompatibility causing proc-macros to crash, see https://github.com/bjorn3/rustc_codegen_cranelift/issues/1184. Thanks to `@afonso360` for implementing all Cranelift features that were necessary to compile for AArch64 using cg_clif. Also thanks to `@shamatar` for implementing the `llvm.x86.addcarry.64` and `llvm.x86.subborrow.64` llvm intrinsics used by num-bigint (https://github.com/bjorn3/rustc_codegen_cranelift/pull/1178) and `@eggyal` for implementing multi-threading support for the lazy jit mode. (https://github.com/bjorn3/rustc_codegen_cranelift/pull/1166)

r? `@ghost`

`@rustbot` label +A-codegen +A-cranelift +T-compiler
JohnTitor referenced this pull request in JohnTitor/rust Jul 7, 2021
…jorn3

Sync rustc_codegen_cranelift

The main hightlight this sync is basic support for AArch64. Most things should work on Linux, but there does seem to be an ABI incompatibility causing proc-macros to crash, see https://github.com/bjorn3/rustc_codegen_cranelift/issues/1184. Thanks to ``@afonso360`` for implementing all Cranelift features that were necessary to compile for AArch64 using cg_clif. Also thanks to ``@shamatar`` for implementing the `llvm.x86.addcarry.64` and `llvm.x86.subborrow.64` llvm intrinsics used by num-bigint (https://github.com/bjorn3/rustc_codegen_cranelift/pull/1178) and ``@eggyal`` for implementing multi-threading support for the lazy jit mode. (https://github.com/bjorn3/rustc_codegen_cranelift/pull/1166)

r? ``@ghost``

``@rustbot`` label +A-codegen +A-cranelift +T-compiler
JohnTitor referenced this pull request in JohnTitor/rust Jul 8, 2021
…jorn3

Sync rustc_codegen_cranelift

The main hightlight this sync is basic support for AArch64. Most things should work on Linux, but there does seem to be an ABI incompatibility causing proc-macros to crash, see https://github.com/bjorn3/rustc_codegen_cranelift/issues/1184. Thanks to ```@afonso360``` for implementing all Cranelift features that were necessary to compile for AArch64 using cg_clif. Also thanks to ```@shamatar``` for implementing the `llvm.x86.addcarry.64` and `llvm.x86.subborrow.64` llvm intrinsics used by num-bigint (https://github.com/bjorn3/rustc_codegen_cranelift/pull/1178) and ```@eggyal``` for implementing multi-threading support for the lazy jit mode. (https://github.com/bjorn3/rustc_codegen_cranelift/pull/1166)

r? ```@ghost```

```@rustbot``` label +A-codegen +A-cranelift +T-compiler
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support threads for lazy jit mode

3 participants