Add scalar support for offload #150288
@Sa4dUs, what's the current state of this? Given the omp/ol expectation that scalars are 64-bit integers, I guess this code only works for i64 and usize, and everything else will need the casts (or only works by chance)?
It works for all 64-bit scalars. For f64 it's not what omp expects, but somehow it just works; not sure if it will break in some weird edge case.
☔ The latest upstream changes made this pull request unmergeable. Please resolve the merge conflicts.
☔ The latest upstream changes (presumably #150606) made this pull request unmergeable. Please resolve the merge conflicts.
rustbot has assigned @jdonszelmann.
let mut old_args_rebuilt = Vec::with_capacity(old_param_types.len());

for (i, &old_ty) in old_param_types.iter().enumerate() {
    let new_arg = unsafe { llvm::LLVMGetParam(new_fn, (i + 1) as u32) };
I had missed this one. Otherwise lgtm.
Use the wrapper in compiler/rustc_codegen_llvm/src/llvm/mod.rs:
/// Safe wrapper around `LLVMGetParam`, because segfaults are no fun.
pub(crate) fn get_param(llfn: &Value, index: c_uint) -> &Value {
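To make the suggestion concrete, here is a minimal, self-contained sketch of the safe-wrapper pattern. It does not use the real rustc or LLVM types: `MockFn`, `raw_get_param`, and `rebuild_old_args` are hypothetical stand-ins invented for illustration, and the bounds check shown is an assumption about what such a wrapper would do, since the snippet above shows only the signature.

```rust
/// Hypothetical stand-in for an LLVM function value; not the real `llvm::Value`.
struct MockFn {
    params: Vec<u64>,
}

/// Stand-in for a raw FFI accessor: it performs no bounds check, so the
/// caller must guarantee that `index` is in range.
unsafe fn raw_get_param(f: &MockFn, index: u32) -> &u64 {
    // SAFETY: the caller promises `index < f.params.len()`.
    unsafe { f.params.get_unchecked(index as usize) }
}

/// Safe wrapper: validates the index first, so a bad index panics with a
/// readable message instead of reading out of bounds.
fn get_param(f: &MockFn, index: u32) -> &u64 {
    assert!(
        (index as usize) < f.params.len(),
        "out of bounds argument access: {} out of {} arguments",
        index,
        f.params.len()
    );
    // SAFETY: the index was checked against the parameter count above.
    unsafe { raw_get_param(f, index) }
}

/// Mirrors the loop shape from the diff: old parameter `i` is read from
/// parameter `i + 1` of the new function, now through the safe wrapper.
fn rebuild_old_args(new_fn: &MockFn, old_param_count: usize) -> Vec<u64> {
    let mut old_args_rebuilt = Vec::with_capacity(old_param_count);
    for i in 0..old_param_count {
        old_args_rebuilt.push(*get_param(new_fn, (i + 1) as u32));
    }
    old_args_rebuilt
}
```

With the wrapper, the call site no longer needs its own `unsafe` block, and an off-by-one in the `i + 1` offset shows up as a panic rather than as a segfault.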
// CHECK: store double 4.200000e+01, ptr %0, align 8
// CHECK: %_0.i = load double, ptr %0, align 8
// CHECK: store double %_0.i, ptr %addr, align 8
// CHECK-NEXT: call void @__tgt_register_lib(ptr nonnull %EmptyDesc)
While at it, can you please remove the check for `call void @__tgt_register_lib(ptr nonnull %EmptyDesc)`? My other open PR moves it into globals, so that would break this test.
let ty_kind = cx.type_kind(ty);
let (base_val, gep_base) = match ty_kind {
    TypeKind::Pointer => (v, v),
    TypeKind::Half | TypeKind::Float | TypeKind::Double | TypeKind::Integer => {
Add a FIXME that we should later check for f128 support. At least newer NVIDIA cards should support it.
// CHECK: define{{( dso_local)?}} void @main()
// CHECK-NOT: define
// CHECK: %EmptyDesc = alloca %struct.__tgt_bin_desc, align 8
Oh and also remove this
@bors delegate
@bors r=ZuseZ4
Add scalar support for offload

This PR adds scalar support to the offload feature. The scalar management has two main parts:

- On the host side, each scalar arg is cast to the `ix` type, zero-extended to `i64`, and passed to the kernel like that.
- On the device, each scalar arg (`i64` at that point) is truncated to `ix` and then cast to the original type.

r? @ZuseZ4
Rollup of 8 pull requests

Successful merges:

- #149587 (coverage: Sort the expansion tree to help choose a single BCB for child expansions)
- #150071 (Add dist step for Enzyme)
- #150288 (Add scalar support for offload)
- #151091 (Add new "hide deprecated items" setting in rustdoc)
- #151255 (rustdoc: Fix ICE when deprecated note is not resolved on the correct `DefId`)
- #151375 (Fix terminal width dependent tests)
- #151384 (add basic `TokenStream` api tests)
- #151391 (rustc-dev-guide subtree update)

r? @ghost
Rollup merge of #150288 - offload-bench-fix, r=ZuseZ4
This PR adds scalar support to the offload feature. The scalar management has two main parts:

- On the host side, each scalar arg is cast to the `ix` type, zero-extended to `i64`, and passed to the kernel like that.
- On the device, each scalar arg (`i64` at that point) is truncated to `ix` and then cast to the original type.

r? @ZuseZ4
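The host/device round trip described above can be modelled in plain Rust. This is only a sketch of the idea, not the code from the PR: the function names are hypothetical, Rust's `to_bits`/`from_bits` and `as` casts stand in for the LLVM bitcast, zero-extend, and truncate operations, and `u64` stands in for LLVM's signless `i64`.

```rust
/// Host side for `f32`: reinterpret the bits as a 32-bit integer
/// (the `ix` step, with x = 32), then zero-extend to 64 bits.
fn host_pack_f32(x: f32) -> u64 {
    let bits: u32 = x.to_bits(); // bitcast f32 -> 32-bit integer
    bits as u64 // unsigned widening is a zero-extension
}

/// Device side for `f32`: truncate the 64-bit argument back to 32 bits,
/// then reinterpret those bits as the original type.
fn device_unpack_f32(arg: u64) -> f32 {
    let bits = arg as u32; // truncate to the low 32 bits
    f32::from_bits(bits)
}

/// Host side for `i16`: go through `u16` first so that widening is a
/// zero-extension rather than a sign-extension.
fn host_pack_i16(x: i16) -> u64 {
    (x as u16) as u64
}

/// Device side for `i16`: truncate to 16 bits and reinterpret as signed.
fn device_unpack_i16(arg: u64) -> i16 {
    (arg as u16) as i16
}
```

Because the upper bits are always zero after packing and are discarded on unpacking, the round trip preserves the value for any scalar narrower than 64 bits; a negative `i16` such as `-1` travels as `0xFFFF` and comes back as `-1`.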