Skip to content

Conversation

@Sa4dUs
Copy link
Contributor

@Sa4dUs Sa4dUs commented Dec 22, 2025

This PR adds scalar support to the offload feature. The scalar management has two main parts:

On the host side, each scalar arg is casted to ix type, zero extended to i64 and passed to the kernel like that.
On the device, the each scalar arg (i64 at that point), is truncated to ix and then casted to the original type.

r? @ZuseZ4

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 22, 2025
@ZuseZ4
Copy link
Member

ZuseZ4 commented Dec 30, 2025

@Sa4dUs, what's the current state of this? Given the omp/ol expectation of scalars being 64 bit integers I guess this code only works for i64 and usize and everything else will need the casts (or only works by chance)?
I'm fine with having casts in a follow-up PR, but I guess we should start having more test files in which we check that all the different argument types are being handled correctly, so at least an i64 test for this one?

@Sa4dUs
Copy link
Contributor Author

Sa4dUs commented Dec 30, 2025

it works for all 64-bit scalars. for f64 it's not what omp expects but somehow it just works, not sure it it will break in some weird edge case

@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 2, 2026

☔ The latest upstream changes made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Collaborator

bors commented Jan 3, 2026

☔ The latest upstream changes (presumably #150606) made this pull request unmergeable. Please resolve the merge conflicts.

@Sa4dUs Sa4dUs force-pushed the offload-bench-fix branch from 0330a0e to ea71b7b Compare January 3, 2026 17:03
@ZuseZ4 ZuseZ4 mentioned this pull request Jan 6, 2026
5 tasks
@Sa4dUs Sa4dUs force-pushed the offload-bench-fix branch from ea71b7b to 8b7e6b6 Compare January 7, 2026 13:28
@Sa4dUs Sa4dUs force-pushed the offload-bench-fix branch 2 times, most recently from 5189bbb to 45ccec5 Compare January 14, 2026 15:30
@Sa4dUs Sa4dUs changed the title WIP Offload Benchmarks Fixes Add scalar support for offload Jan 14, 2026
@Sa4dUs Sa4dUs force-pushed the offload-bench-fix branch from 45ccec5 to 38f7fc7 Compare January 14, 2026 15:38
@Sa4dUs Sa4dUs marked this pull request as ready for review January 14, 2026 17:56
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 14, 2026
@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jan 14, 2026
@rustbot
Copy link
Collaborator

rustbot commented Jan 14, 2026

r? @jdonszelmann

rustbot has assigned @jdonszelmann.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot assigned ZuseZ4 and unassigned jdonszelmann Jan 14, 2026
@Sa4dUs Sa4dUs force-pushed the offload-bench-fix branch from 38f7fc7 to 5e980e1 Compare January 14, 2026 19:42
@rust-log-analyzer

This comment has been minimized.

let mut old_args_rebuilt = Vec::with_capacity(old_param_types.len());

for (i, &old_ty) in old_param_types.iter().enumerate() {
let new_arg = unsafe { llvm::LLVMGetParam(new_fn, (i + 1) as u32) };
Copy link
Member

@ZuseZ4 ZuseZ4 Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had missed this one. Otherwise lgtm.
Use compiler/rustc_codegen_llvm/src/llvm/mod.rs

  287 /// Safe wrapper around `LLVMGetParam`, because segfaults are no fun.
    1 pub(crate) fn get_param(llfn: &Value, index: c_uint) -> &Value {

// CHECK: store double 4.200000e+01, ptr %0, align 8
// CHECK: %_0.i = load double, ptr %0, align 8
// CHECK: store double %_0.i, ptr %addr, align 8
// CHECK-NEXT: call void @__tgt_register_lib(ptr nonnull %EmptyDesc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While at it, can you please remove the check for : call void @__tgt_register_lib(ptr nonnull %EmptyDesc)? My other open PR moves it into globals, so that would break this test.

let ty_kind = cx.type_kind(ty);
let (base_val, gep_base) = match ty_kind {
TypeKind::Pointer => (v, v),
TypeKind::Half | TypeKind::Float | TypeKind::Double | TypeKind::Integer => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a FIXME that we should later check for f128 support. At least newer NVIDIA cards should support it.


// CHECK: define{{( dso_local)?}} void @main()
// CHECK-NOT: define
// CHECK: %EmptyDesc = alloca %struct.__tgt_bin_desc, align 8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and also remove this

@Sa4dUs Sa4dUs force-pushed the offload-bench-fix branch from 272a1a6 to 307a4fc Compare January 19, 2026 21:31
@ZuseZ4
Copy link
Member

ZuseZ4 commented Jan 19, 2026

@bors delegate
r=me when CI passes.

@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 19, 2026

✌️ @Sa4dUs, you can now approve this pull request!

If @ZuseZ4 told you to "r=me" after making some further change, then please make that change and post @bors r=ZuseZ4.

@Sa4dUs
Copy link
Contributor Author

Sa4dUs commented Jan 19, 2026

@bors r=ZuseZ4

@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 19, 2026

📌 Commit 307a4fc has been approved by ZuseZ4

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 19, 2026
Zalathar added a commit to Zalathar/rust that referenced this pull request Jan 20, 2026
Add scalar support for offload

This PR adds scalar support to the offload feature. The scalar management has two main parts:

On the host side, each scalar arg is casted to `ix` type, zero extended to `i64` and passed to the kernel like that.
On the device, the each scalar arg (`i64` at that point), is truncated to `ix` and then casted to the original type.

r? @ZuseZ4
rust-bors bot pushed a commit that referenced this pull request Jan 20, 2026
Rollup of 8 pull requests

Successful merges:

 - #149587 (coverage: Sort the expansion tree to help choose a single BCB for child expansions)
 - #150071 (Add dist step for Enzyme)
 - #150288 (Add scalar support for offload)
 - #151091 (Add new "hide deprecated items" setting in rustdoc)
 - #151255 (rustdoc: Fix ICE when deprecated note is not resolved on the correct `DefId`)
 - #151375 (Fix terminal  width dependent tests)
 - #151384 (add basic `TokenStream` api tests)
 - #151391 (rustc-dev-guide subtree update)

r? @ghost
@rust-bors rust-bors bot merged commit 1262ff9 into rust-lang:main Jan 20, 2026
11 checks passed
@rustbot rustbot added this to the 1.95.0 milestone Jan 20, 2026
rust-timer added a commit that referenced this pull request Jan 20, 2026
Rollup merge of #150288 - offload-bench-fix, r=ZuseZ4

Add scalar support for offload

This PR adds scalar support to the offload feature. The scalar management has two main parts:

On the host side, each scalar arg is casted to `ix` type, zero extended to `i64` and passed to the kernel like that.
On the device, the each scalar arg (`i64` at that point), is truncated to `ix` and then casted to the original type.

r? @ZuseZ4
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Jan 20, 2026
Rollup of 8 pull requests

Successful merges:

 - rust-lang/rust#149587 (coverage: Sort the expansion tree to help choose a single BCB for child expansions)
 - rust-lang/rust#150071 (Add dist step for Enzyme)
 - rust-lang/rust#150288 (Add scalar support for offload)
 - rust-lang/rust#151091 (Add new "hide deprecated items" setting in rustdoc)
 - rust-lang/rust#151255 (rustdoc: Fix ICE when deprecated note is not resolved on the correct `DefId`)
 - rust-lang/rust#151375 (Fix terminal  width dependent tests)
 - rust-lang/rust#151384 (add basic `TokenStream` api tests)
 - rust-lang/rust#151391 (rustc-dev-guide subtree update)

r? @ghost
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants