Fix sret alloca alignment to match callee's preferred type alignment #61192

Merged
maleadt merged 1 commit into master from tb/sret_align
Mar 4, 2026

@maleadt maleadt (Member) commented Feb 27, 2026

The caller's sret alloca used julia_alignment (union_align), which can be smaller than the LLVM preferred type alignment that the callee uses for its loads and stores. For example, a struct of floats gets julia_alignment=4, but the callee uses DL.getPrefTypeAlign()=8 and generates 8-byte-aligned memcpy operations. On strict-alignment targets (NVPTX), the resulting misaligned access causes CUDA_ERROR_MISALIGNED_ADDRESS.
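
To make the mismatch concrete, here is a minimal Python model of the arithmetic involved. It is only a sketch, not Julia's codegen: the alignments 4 and 8 come from the description above, while the helper `is_aligned` and the sample addresses are illustrative assumptions.

```python
def is_aligned(addr: int, align: int) -> bool:
    """Return True if addr is a multiple of align (align must be a power of two)."""
    assert align > 0 and align & (align - 1) == 0, "alignment must be a power of two"
    return addr & (align - 1) == 0


JULIA_ALIGNMENT = 4   # caller's alloca alignment for a struct of floats (from the PR text)
CALLEE_ALIGNMENT = 8  # alignment the callee assumes for its accesses (from the PR text)

# Hypothetical stack addresses that all satisfy the caller's 4-byte guarantee.
candidate_addrs = [0x1000, 0x1004, 0x1008, 0x100C]

# Addresses the caller may legally hand out but that break the callee's assumption.
misaligned = [
    addr
    for addr in candidate_addrs
    if is_aligned(addr, JULIA_ALIGNMENT) and not is_aligned(addr, CALLEE_ALIGNMENT)
]

print([hex(addr) for addr in misaligned])
```

Every second 4-byte-aligned address fails the 8-byte check, so whether the callee's access faults depends on where the alloca happens to land.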

Fix by computing the sret type's preferred alignment from the callee's StructRet attribute and taking the max with union_align, matching the alignment the callee computes for its sret parameter.
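
The rule described in this comment can be sketched as follows. This is a hedged Python model of the rule only; the function name `caller_sret_alignment` and its parameters are hypothetical and do not correspond to identifiers in Julia's C++ codegen.

```python
def caller_sret_alignment(union_align: int, callee_pref_align: int) -> int:
    """Alignment for the caller's sret alloca under the rule described above.

    Both inputs are assumed to be powers of two, as alignments always are.
    """
    # The larger power of two is a multiple of the smaller one, so an address
    # aligned to the maximum satisfies both the caller's and the callee's needs.
    return max(union_align, callee_pref_align)


# Struct-of-floats case from the description: julia_alignment=4, preferred=8.
print(caller_sret_alignment(4, 8))
```

Taking the maximum never lowers the caller's existing guarantee, so it only matters when the callee's preferred alignment exceeds union_align.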

Fixes JuliaGPU/CUDA.jl#3034
Regression introduced in 1.12 by #55730

@maleadt maleadt requested a review from vtjnash February 27, 2026 14:37
@maleadt maleadt added the compiler:codegen (Generation of LLVM IR and native code), gpu (Affects running Julia on a GPU), backport 1.12 (Change should be backported to release-1.12), and backport 1.13 (Change should be backported to release-1.13) labels Feb 27, 2026
@maleadt maleadt force-pushed the tb/sret_align branch 2 times, most recently from a908e3c to 2464148 Compare March 2, 2026 14:43
@gbaraldi gbaraldi (Member) commented Mar 2, 2026

LGTM

@gbaraldi gbaraldi added the merge me (PR is reviewed. Merge when all tests are passing) label Mar 2, 2026
@maleadt maleadt removed the merge me (PR is reviewed. Merge when all tests are passing) label Mar 2, 2026
@maleadt maleadt marked this pull request as draft March 3, 2026 08:38
@maleadt maleadt marked this pull request as ready for review March 3, 2026 09:12
@maleadt maleadt requested a review from vtjnash March 3, 2026 09:13
@KristofferC KristofferC mentioned this pull request Mar 3, 2026
56 tasks
@maleadt maleadt requested a review from topolarity March 3, 2026 19:41
The sret parameter's alignment attribute was set to LLVM's preferred type
alignment (getPrefTypeAlign), which can exceed julia_alignment. This caused
misaligned memory accesses on strict-alignment targets like NVPTX, since the
caller's alloca uses julia_alignment. Fix by setting the sret alignment to
julia_alignment and not overriding it in the function definition, so that
caller and callee agree on the same alignment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
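
The invariant this commit message relies on, that the caller's alloca must guarantee at least the alignment the callee's sret parameter advertises, can be modeled in a few lines. This is a sketch under stated assumptions: the `SretContract` class and its field names are hypothetical, and the values 4 and 8 are borrowed from the struct-of-floats example in the PR description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SretContract:
    """Hypothetical model of the caller/callee agreement on an sret slot."""

    alloca_align: int  # alignment the caller gives its sret alloca
    param_align: int   # alignment attribute on the callee's sret parameter

    def is_sound(self) -> bool:
        # The callee may assume param_align, so the caller must provide at least that.
        return self.alloca_align >= self.param_align


julia_alignment = 4  # from the PR's struct-of-floats example
pref_type_align = 8  # LLVM preferred type alignment in the same example

# Before: callee advertised the preferred alignment, caller allocated with julia_alignment.
before = SretContract(alloca_align=julia_alignment, param_align=pref_type_align)
# After: both sides use julia_alignment, as the commit message describes.
after = SretContract(alloca_align=julia_alignment, param_align=julia_alignment)

print(before.is_sound(), after.is_sound())
```

Lowering the callee's advertised alignment restores the invariant without changing how the caller allocates.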
@topolarity topolarity (Member) left a comment

Thanks @maleadt!

@topolarity topolarity added the merge me (PR is reviewed. Merge when all tests are passing) label Mar 3, 2026
@maleadt maleadt merged commit f519f3e into master Mar 4, 2026
8 of 9 checks passed
@maleadt maleadt deleted the tb/sret_align branch March 4, 2026 07:05
@DilumAluthge DilumAluthge removed the merge me (PR is reviewed. Merge when all tests are passing) label Mar 4, 2026
maleadt added a commit that referenced this pull request Mar 6, 2026
maleadt added a commit that referenced this pull request Mar 6, 2026
@maleadt maleadt removed the backport 1.13 (Change should be backported to release-1.13) label Mar 6, 2026
@maleadt maleadt mentioned this pull request Mar 6, 2026
36 tasks

Labels

backport 1.12 (Change should be backported to release-1.12), compiler:codegen (Generation of LLVM IR and native code), gpu (Affects running Julia on a GPU)

Development

Successfully merging this pull request may close these issues.

Julia 1.12: Misaligned address error

5 participants